How can I automatically validate HTML?
I need to validate a large number of HTML files: fix the indentation (or at least not remove the existing indentation), close unclosed tags, and so on.
Among the online services, I found none that close tags; they only fix indentation.
Among desktop programs there is Tidy, but it behaves strangely and does something completely different: text turns into text.
I also tried html5lib for Python, but it did not give a good result either:
import html5lib
from html5lib.serializer import HTMLSerializer

def pars(html):
    # Parse the fragment into a DOM tree; the parser closes unclosed tags itself.
    parser = html5lib.HTMLParser(tree=html5lib.getTreeBuilder("dom"))
    dom_tree = parser.parseFragment(html)
    # Walk the tree and serialize it back to markup, keeping optional tags.
    walker = html5lib.getTreeWalker("dom")
    stream = walker(dom_tree)
    s = HTMLSerializer(omit_optional_tags=False)
    return ''.join(s.serialize(stream))

res = pars('html code')
print(res)
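For comparison, here is a minimal sketch (not part of the attempt above) of one way to close unclosed tags while leaving the existing indentation and text untouched. It uses only the standard-library `html.parser`. The names `TagCloser`, `close_tags`, and `VOID_TAGS` are hypothetical, introduced only for illustration. It assumes a simple "stack of open tags" strategy and does not implement HTML5's implicit-closing rules (for example, a second `<li>` will not close the first one), so treat it as a starting point rather than a validator.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagCloser(HTMLParser):
    """Echo the input as-is and add closing tags for elements left open."""

    def __init__(self):
        # Keep entities such as &amp; unconverted so they can be echoed back.
        super().__init__(convert_charrefs=False)
        self.out = []    # pieces of the output text
        self.stack = []  # names of the currently open elements

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # original tag text
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # e.g. <br/>

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag with no matching opener: drop it
        # Close everything opened after the matching start tag, then the tag.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</%s>" % open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(data)  # text and whitespace are kept verbatim

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

    def handle_pi(self, data):
        self.out.append("<?%s>" % data)


def close_tags(html):
    """Return html with a closing tag added for every element left open."""
    closer = TagCloser()
    closer.feed(html)
    closer.close()
    # Anything still open at the end of input is closed, innermost first.
    while closer.stack:
        closer.out.append("</%s>" % closer.stack.pop())
    return "".join(closer.out)


print(close_tags("<div>\n  <p>text\n</div>"))
```

To process many files, the same function could be applied to the text of each file in a loop. Note that closing tags are emitted in lowercase, and that a strict parser such as html5lib is still the better choice when the result has to match browser behavior.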