Python
tucnak, 2011-08-17 14:56:34

Python and XML

I am new to Python and have decided to learn how to work with XML. As usual, I ran into difficulties.

Environment: Windows 7, Python 2.7.

document.xml

<document>
  <name>Illya Kovalevskiy</name>
  <hobby>Computer Programming</hobby>
</document>


xml-parser.py

from xml.dom.minidom import *

xml = parse('document.xml')
name = xml.getElementsByTagName('name')

for node in name:
  print node


When I run it, it prints: DOM Element: name at 0xblablab

Sorry for the very clumsy parsing code.

Please point out the mistakes in how I parse the tree and explain how to read the text between the tags.

I am specifically interested in parsing a tree whose structure is known in advance.

PS Sorry for the grammatical errors; they don't teach Russian in the 7th grade of the Kiev Lyceum.


6 answer(s)
shsmad, 2011-08-17
@tucnak

Voilà:

from xml.dom.minidom import *

xml = parse('document.xml')
name = xml.getElementsByTagName('name')

for node in name:
  print node.childNodes[0].nodeValue
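
A slightly more defensive variant of the same idea, as a sketch only: `childNodes[0]` assumes the element has at least one child and that the first child is its text, so an empty `<name/>` would raise an IndexError. The helper name `get_text` is hypothetical, and the XML is inlined with `parseString` so the snippet runs without `document.xml`. It uses `print()` so that it runs on Python 3 as well as 2.7.

```python
from xml.dom.minidom import parseString

# Same content as document.xml from the question, inlined for the example.
XML = """<document>
  <name>Illya Kovalevskiy</name>
  <hobby>Computer Programming</hobby>
</document>"""


def get_text(element):
    """Join the data of all direct text children of an element.

    Hypothetical helper: returns an empty string when the element
    has no text children instead of raising IndexError.
    """
    parts = []
    for child in element.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            parts.append(child.data)
    return "".join(parts)


dom = parseString(XML)
names = [get_text(node) for node in dom.getElementsByTagName("name")]
print(names)
```

With a file on disk, `parse('document.xml')` can replace `parseString(XML)`; the rest stays the same.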

bekbulatov, 2011-08-17
@bekbulatov

With lxml you can do it like this:

from lxml import etree
tree = etree.parse('document.xml')
print tree.xpath("/document/name/text()")
print tree.xpath("/document/hobby/text()")

Tutorial

Anatoly, 2011-08-17
@taliban

Well, you really are something.
1. I don't know Python
2. I downloaded Python
3. I ran your script
4. I got your result
5. !!!
6. PROFIT


from xml.dom.minidom import *

xml = parse('document.xml')
name = xml.getElementsByTagName('name')

for node in name:
  print dir(node)

Conclusion: the new generation is lazy as hell. Do you know what's behind the exclamation marks? (One-two and done: the first Google result for "python object method list".)

dutchakdev, 2011-08-17
@dutchakdev

Personally, I'm far from a pro in Python, but I really like the language. I'm busy right now and have no time to sit down and dig into it.
For now, this is all I can come up with:
#demo.xml

<document>
  <name>AcidSlayer</name>
  <hobby>Python</hobby>
</document>

#habra.py
# Self-explanatory
from xml.dom.minidom import parseString
# Open the file
file = open('demo.xml')
# Read it into a string
data = file.read()
# Self-explanatory
file.close()
# Parse the file contents
dom = parseString(data)
# Get the tags as XML strings
nameTag = dom.getElementsByTagName('name')[0].toxml()
hobbyTag = dom.getElementsByTagName('hobby')[0].toxml()
# Strip the surrounding tags
name=nameTag.replace('<name>','').replace('</name>','')
hobby=hobbyTag.replace('<hobby>','').replace('</hobby>','')
# Print the result
print name
print hobby

Something like this.

Ano, 2011-08-17
@Ano

Well, you're all something else. Are you aware that Python has documentation? With examples.
docs.python.org/library/xml.dom.minidom.html
Are you aware that DOM principles are the same everywhere, and that there are text nodes?

>>> from xml.dom.minidom import parseString
>>> dom = parseString('<doc><name>Non nom</name><hobby>python</hobby></doc>')
>>> textnode = dom.getElementsByTagName('name')[0].childNodes[0]
>>> print textnode
<DOM Text node "u'Non nom'">
>>> textnode.nodeType == textnode.TEXT_NODE
True
>>> textnode.nodeValue
u'Non nom'
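
The point about text nodes also matters for the original `document.xml`: because that file is indented, the whitespace between the tags becomes text nodes too. A rough sketch of what that looks like, with the question's XML inlined; the `kinds` and `fields` names are made up for the example, and it assumes each element's first child is its text.

```python
from xml.dom.minidom import parseString

# Same content as document.xml from the question, inlined for the example.
XML = """<document>
  <name>Illya Kovalevskiy</name>
  <hobby>Computer Programming</hobby>
</document>"""

root = parseString(XML).documentElement

# Node types of every child of <document>, including the whitespace
# text nodes that sit between the element tags.
kinds = [child.nodeType for child in root.childNodes]

# Keep only element children and map tag name -> data of the first child
# (assumed here to be a text node).
fields = {}
for child in root.childNodes:
    if child.nodeType == child.ELEMENT_NODE and child.firstChild is not None:
        fields[child.tagName] = child.firstChild.data

print(len(root.childNodes))
print(fields["name"])
print(fields["hobby"])
```

So indexing `root.childNodes` directly is fragile on indented files; filtering by `nodeType` or using `getElementsByTagName` avoids tripping over the whitespace nodes.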

FloppyFormator, 2011-08-18
@FloppyFormator

Have you tried the wonderful ElementTree library? It seems like parsing XML with this thing is much easier.
PS Some people write much worse in Russian, but they don't even think of apologizing. You don't make any mistakes :)
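
For comparison, a minimal sketch of the same task with the standard-library `xml.etree.ElementTree` module. The XML from the question is inlined here; it assumes `name` and `hobby` are direct children of the root, as in `document.xml`.

```python
import xml.etree.ElementTree as ET

# Same content as document.xml from the question, inlined for the example.
XML = """<document>
  <name>Illya Kovalevskiy</name>
  <hobby>Computer Programming</hobby>
</document>"""

# fromstring() returns the root element (<document>) directly.
root = ET.fromstring(XML)

# findtext() returns the text of the first matching direct child,
# or None when there is no such child.
name = root.findtext("name")
hobby = root.findtext("hobby")

print(name)
print(hobby)
```

To read from a file instead, `ET.parse('document.xml').getroot()` gives the same root element.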
