Parsing
Chichi, 2015-11-29 14:07:03

How to split data from table cells using XPath?

I have the following html table:

<table class="info">
<tbody>
    <tr><td class="name">Year</td><td>2011</td></tr>
    <tr><td class="name">Storey</td><td>3</td></tr>
    <tr><td class="name">Area</td><td>170</td></tr>
    <tr><td class="name">Condition</td><td>Renovated</td></tr>
    <tr><td class="name">Bathroom</td><td>2</td></tr>
</tbody>
</table>

In this table, each row contains two <td> cells. The first cell holds the label for the data, for example the year of construction (Year). The second cell holds the value itself (2011).
I want to extract the data so that the label and the value are separate but stay paired with each other, like this:
Year: 2011
Storey: 3
Area: 170
Condition: Renovated
Bathroom: 2
I want to get each row and the data from its cells so that the information can be spread across separate columns in Excel: the label in the first column and the value in the second.
At the moment I have the following XPath expression:
//table[@class="info"]//tr//td/text()
It returns data in a single stream in the following format:
Year
2011
Storey
3
Area
170
Condition
Renovated
I would like to extract the rows and cells separately so that I can put them into different columns of an Excel spreadsheet:
Year (1st excel column): 2011 (2nd excel column)
Storey (1st excel column): 3 (2nd excel column)
How can I do this with XPath?

1 answer(s)
throughtheether, 2015-11-29
@ChicoId

//table[@class="info"]//tr//td/text()
Why use "//" between tr and td when td is a direct child of tr? In my opinion, it is better to write the most specific XPath expression you can. Also, please clarify which environment (programming language) you are using these expressions in.
If you rewrite your expression like this:
//table[@class="info"]/tbody/tr/td[1]/text()
you will get the values Year, Storey, Area, Condition, Bathroom.
Similarly,
//table[@class="info"]/tbody/tr/td[2]/text()
will give 2011, 3, 170, Renovated, 2.
Then you can combine both lists in whatever programming language you are using.
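For instance, combining the two lists might look roughly like this in Python. This is only a sketch: Python is an assumption (the question does not say which language is used), and the standard-library xml.etree.ElementTree supports only a subset of XPath with no text() step, so the cells are selected and their .text is read instead. The PAGE string is a made-up wrapper around the table from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: the table from the question inside a minimal wrapper.
PAGE = """<html><body>
<table class="info">
<tbody>
    <tr><td class="name">Year</td><td>2011</td></tr>
    <tr><td class="name">Storey</td><td>3</td></tr>
    <tr><td class="name">Area</td><td>170</td></tr>
    <tr><td class="name">Condition</td><td>Renovated</td></tr>
    <tr><td class="name">Bathroom</td><td>2</td></tr>
</tbody>
</table>
</body></html>"""

root = ET.fromstring(PAGE)

# First cell of every row (labels) and second cell of every row (values).
names = [td.text for td in root.findall(".//table[@class='info']/tbody/tr/td[1]")]
values = [td.text for td in root.findall(".//table[@class='info']/tbody/tr/td[2]")]

# Pair the two lists by position.
pairs = list(zip(names, values))
for name, value in pairs:
    print(f"{name}: {value}")
```

Pairing by position only works if every row has both cells; a row with a missing cell would shift one list against the other.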
Or you can get a list of nodes, the table rows:
//table[@class="info"]/tbody/tr
and then, iterating over them, evaluate the expressions td[1]/text() and td[2]/text() on each row.
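The row-by-row variant could be sketched as follows, again assuming Python and its standard-library xml.etree.ElementTree (limited XPath, so .text is used in place of text()). The PAGE string is a hypothetical stand-in for the real page, and the result is written as CSV text, which Excel can open with the label in the first column and the value in the second.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical page: the table from the question inside a minimal wrapper.
PAGE = """<html><body>
<table class="info">
<tbody>
    <tr><td class="name">Year</td><td>2011</td></tr>
    <tr><td class="name">Storey</td><td>3</td></tr>
    <tr><td class="name">Area</td><td>170</td></tr>
    <tr><td class="name">Condition</td><td>Renovated</td></tr>
    <tr><td class="name">Bathroom</td><td>2</td></tr>
</tbody>
</table>
</body></html>"""

root = ET.fromstring(PAGE)

rows = []
# Select the row nodes first, then evaluate relative paths on each row.
for tr in root.findall(".//table[@class='info']/tbody/tr"):
    name = tr.find("td[1]").text   # first cell: label
    value = tr.find("td[2]").text  # second cell: value
    rows.append((name, value))

# Write one CSV line per table row; an in-memory buffer is used here,
# a real script would open a file such as "info.csv" instead.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Because each label and value come from the same tr node, the pairing stays correct even if some other row is malformed. A real-world page is often not well-formed XML, in which case an HTML-aware parser would be needed in place of ElementTree.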
