How to remove <br/> tags using XPath?


Python

JRazor, 2015-03-21 19:02:56

Hello.
I have a question about XPath. Given this markup:

<span class="less-review">I visited this clinic as my wisdom tooth is growing horizontally resulting in bleeding gums,the doctor examined and said the tooth has to go out 
<br/>it might result in the surrounding teeth&#039;s going bad  she explained 
<br/>everything in very layman terms as to why we would extracting it 
<br/>have made reservations with them to get the tooth extracted,overall a  nice experience, 
<br/>
<br/></span>

When parsing, I want to drop the <br/> tags and get the entire comment as a single line, without workarounds such as parsing the child nodes separately and then joining them. Can this be done in XPath alone?
Thank you very much in advance.


5 answers

Andrey K, 2015-03-21
@mututunus

''.join(html.xpath('//span[@class="less-review"]/text()'))

Igor Nikolaev, 2015-03-22
@nightvich

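# Inside a Scrapy callback: response.xpath() is the shortcut that replaces the deprecated HtmlXPathSelector
data = response.xpath('//span[@class="less-review"]/text()').extract()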

mukizu, 2015-03-26
@mukizu

I think you should look into the normalize-space() function: stackoverflow.com/questions/11007527/xpath-to-get-...
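
A minimal sketch of what this suggestion could look like with lxml, assuming lxml is installed. The sample markup is a shortened version of the snippet from the question, wrapped in a div, and the function name review_text is made up for illustration. normalize-space() takes the string value of the first matching node, trims it, and collapses every run of whitespace (including the line breaks left around the br tags) into a single space:

```python
from lxml import html

# Shortened sample modelled on the markup in the question.
SNIPPET = (
    '<div><span class="less-review">the tooth has to go out \n'
    '<br/>she explained everything \n'
    '<br/>overall a  nice experience, \n'
    '<br/>\n'
    '<br/></span></div>'
)


def review_text(markup: str) -> str:
    """Return the review as one line, using XPath only."""
    tree = html.fromstring(markup)
    # An XPath expression that evaluates to a string makes lxml
    # return a string rather than a list of nodes.
    return tree.xpath('normalize-space(//span[@class="less-review"])')


print(review_text(SNIPPET))
# the tooth has to go out she explained everything overall a nice experience,
```

If several such spans exist on a page, this expression only covers the first one; in that case it is probably simpler to select the span elements first and evaluate normalize-space(.) on each of them.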

Ilya, 2015-03-26
@glebovgin

Wouldn't it work the same way as in PHP?

$query = $xpath->query('//span[@class="less-review"]');
$query->item(0)->nodeValue; // this is already plain text, without the extra tags.
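
For comparison, a rough Python counterpart of nodeValue, sketched with only the standard library's xml.etree.ElementTree. It assumes the fragment is well-formed XML (the snippet in the question is, since every br is self-closed); the function name element_text and the sample markup are hypothetical:

```python
import xml.etree.ElementTree as ET


def element_text(markup: str, class_name: str = "less-review") -> str:
    """Return all text inside the first matching span as a single line."""
    root = ET.fromstring(markup)
    # ".//span" searches descendants of the root, so the span has to sit
    # below the root element (here it is wrapped in a div).
    span = root.find(f".//span[@class='{class_name}']")
    if span is None:
        return ""
    # itertext() yields the text of the element and of everything nested
    # in it, in document order; split()/join() collapses the whitespace.
    return " ".join("".join(span.itertext()).split())


sample = (
    '<div><span class="less-review">tooth has to go out \n'
    '<br/>overall a  nice experience, \n'
    '<br/></span></div>'
)
print(element_text(sample))
# tooth has to go out overall a nice experience,
```

Real-world HTML is often not valid XML, so for scraped pages an HTML-aware parser would likely be the safer choice.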

MrCarlione, 2015-05-21
@MrCarlione

I can't check this right now, but I think you need to take the parent tag of the block you showed and apply text() to it. For example, if the parent tag is a div, the expression would be "//div/text()". The block's entire text, without the tags, should then end up in the variable.
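
One detail that may be worth verifying with this approach: in XPath, text() after a single slash selects only the text nodes that are direct children of the element, while //text() also reaches text nested inside child tags. A small sketch with lxml (assumed to be installed; the markup and variable names are invented for illustration):

```python
from lxml import html

MARKUP = (
    '<div>intro <span class="less-review">first line'
    '<br/>second line</span> outro</div>'
)

tree = html.fromstring(MARKUP)

# Only the text nodes that are direct children of the div.
direct = tree.xpath('//div/text()')
# Every text node below the div, including those inside the span.
nested = tree.xpath('//div//text()')

print(direct)  # ['intro ', ' outro']
print(nested)  # ['intro ', 'first line', 'second line', ' outro']
```

So applying text() to the parent would skip the review itself if the review sits in a nested span; pointing the expression at the span, or using //text(), avoids that.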