How to extract text between tags?
We have the following code, which parses an XML document:
import xml.etree.ElementTree as ET
doc = """
<?xml version="1.0" encoding="ANSI" ?>
<data>
<items>
<item name="item1">1</item>
<item name="item2">2</item>
<item name="item3">3</item>
<item name="item4">4</item>
</items>
</data>
.----------------------------------------------------------
"""
tree = ET.fromstring(doc)
print(tree.find('.//item[@name="item1"]').text)
print(tree.find('.//item[@name="item4"]').text)
Use lxml with a recovering parser (recover=True):
>>> import lxml.etree
>>> doc = """
... <?xml version="1.0" encoding="ANSI" ?>
... <data>
... <items>
... <item name="item1">1</item>
... <item name="item2">2</item>
... <item name="item3">3</item>
... <item name="item4">4</item>
... </items>
... </data>
... .----------------------------------------------------------
... """
>>> parser = lxml.etree.XMLParser(recover=True)
>>> tree = lxml.etree.fromstring(doc, parser)
>>> [element.text for element in tree.iter('item')]
['1', '2', '3', '4']
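Alternatively, with the standard library, strip the stray characters around the document before parsing:
>>> import xml.etree.ElementTree as ET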
>>> doc = """
... <?xml version="1.0" encoding="ANSI" ?>
... <data>
... <items>
... <item name="item1">1</item>
... <item name="item2">2</item>
... <item name="item3">3</item>
... <item name="item4">4</item>
... </items>
... </data>
... .----------------------------------------------------------
... """
>>> tree = ET.fromstring(doc.strip('\n-.'))
>>> [element.text for element in tree.iter('item')]
['1', '2', '3', '4']
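As a rough sketch, the strip-based fix can be combined with the attribute lookups from the question. The helper name parse_padded_xml and the texts dictionary are made up for illustration. The sketch also assumes the stray characters around the document are only newlines, dashes and dots, as in the sample.

```python
import xml.etree.ElementTree as ET

doc = """
<?xml version="1.0" encoding="ANSI" ?>
<data>
<items>
<item name="item1">1</item>
<item name="item2">2</item>
<item name="item3">3</item>
<item name="item4">4</item>
</items>
</data>
.----------------------------------------------------------
"""


def parse_padded_xml(text):
    """Hypothetical helper: drop stray characters around the XML, then parse."""
    # strip() takes a set of characters, not a substring: it removes leading
    # and trailing newlines, dashes and dots until it meets any other character.
    return ET.fromstring(text.strip('\n-.'))


tree = parse_padded_xml(doc)

# The lookups from the question, run against the cleaned document.
print(tree.find('.//item[@name="item1"]').text)
print(tree.find('.//item[@name="item4"]').text)

# Map each name attribute to the text between its tags.
texts = {item.get('name'): item.text for item in tree.iter('item')}
print(texts)
```

Because strip() only touches the ends of the string, the content between the tags is left alone. If the surrounding junk can contain other characters, this approach is probably too fragile, and the recovering lxml parser is the safer choice.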