Why doesn't lxml.html.document_fromstring output what it should in Python 3?
In every example I have seen, the code below prints 'Hello World', but mine prints: <Element html at 0x2ab9540>
Can you please tell me what the problem is?
import lxml.html

data = """<html>
<head>
</head>
<body>Привет мир</body>
</html>"""
html = lxml.html.document_fromstring(data)
print(html)
>>> import lxml.html
>>> html = lxml.html.fromstring('''\
... <html><body onload="" color="white">
... <p>Hi !</p>
... </body></html>
... ''')
>>> print(lxml.html.tostring(html, encoding="unicode"))
<html><body onload="" color="white"><p>Hi !</p></body></html>
>>> print(lxml.html.tostring(html, encoding="unicode"))
<html> <body color="white" onload=""> <p>Hi !</p> </body> </html>
>>> print(lxml.html.tostring(html, encoding="unicode"))
<html>
<body color="white" onload="">
<p>Hi !</p>
</body>
</html>
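On Python 3, the same idea can be sketched roughly as follows, using the markup from the question. Printing the element itself only shows its repr, which is the `<Element html at 0x...>` seen in the question, while `lxml.html.tostring` serializes the tree back to markup. The names `doc`, `element_repr` and `markup` are illustrative, and `encoding="unicode"` is passed so that a `str` comes back instead of `bytes`.

```python
import lxml.html

data = """<html>
<head>
</head>
<body>Привет мир</body>
</html>"""

doc = lxml.html.document_fromstring(data)

# Printing the element itself shows only its repr: <Element html at 0x...>
element_repr = repr(doc)
print(element_repr)

# Serialize the tree back to markup; encoding="unicode" yields str, not bytes
markup = lxml.html.tostring(doc, encoding="unicode")
print(markup)
```

If only the text is needed rather than the markup, read it from the element, for example through its `.text` attribute.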
1. Use a unicode string (in Python 3 every str already is unicode, so the u prefix is not needed).
2. Access the text through the tags:
from lxml import html
data = u"""<html>
<head>
</head>
<body>Привет мир</body>
</html>"""
# Bind the tree to a new name so the imported html module is not shadowed
doc = html.document_fromstring(data)
print(doc.body.text)
In [1]: Привет мир
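When the tag contains nested markup, `.text` alone may not be enough, because it holds only the text that comes before the first child element. A minimal sketch of the difference, assuming a made-up snippet with a `<b>` tag inside a paragraph (the sample markup and variable names are illustrative, not from the question):

```python
from lxml import html

# Hypothetical sample markup with nested tags (not from the question)
data = """<html>
<head>
</head>
<body><p>Привет <b>мир</b></p></body>
</html>"""

doc = html.document_fromstring(data)
paragraph = doc.body.find("p")

# .text holds only the text before the first child element
leading_text = paragraph.text

# text_content() joins the text of the element and all its descendants
full_text = paragraph.text_content()

print(repr(leading_text))
print(full_text)
```

Here `text_content()` is the safer choice whenever the element may contain further tags.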