How to parse invalid HTML?
Could someone tell me how to parse invalid HTML?
Until now I have always used Simple HTML DOM, and both the results and the speed were fine, but it does not work with invalid HTML: it goes into recursion.
Definitely run Tidy first. It does a great job of fixing all the invalid markup.
Tidy. And there is no need to look for alternatives.
Here is one of my use cases, for example:
$options = array(
    "indent" => false,
    "output-xml" => true,
    "clean" => true,
    "drop-proprietary-attributes" => true,
    "drop-font-tags" => true,
    "drop-empty-paras" => true,
    "hide-comments" => true,
    "join-classes" => true,
    "join-styles" => true,
    "show-body-only" => false,
);
$tidy = new tidy();
// $page contains the invalid HTML; parseString() returns true on success, not a string
$tidy->parseString($page, $options, 'utf8');
$tidy->cleanRepair();
echo $tidy; // repaired, well-formed markup (XML, because "output-xml" is set)
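Since the question was about parsing rather than just cleaning, here is a minimal sketch of one way to take the repaired markup further. It assumes the tidy and dom extensions are enabled. The function name extractParagraphs and the sample markup are hypothetical, and PHP's built-in DOMDocument stands in for Simple HTML DOM only to keep the example self-contained.

```php
<?php
// Hypothetical helper: repair the markup with Tidy, then query it with DOMDocument.
// Assumes the tidy and dom extensions are both loaded.
function extractParagraphs(string $page): array
{
    $tidy = new tidy();
    $tidy->parseString($page, array("indent" => false), 'utf8');
    $tidy->cleanRepair();

    // Collect libxml warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML(tidy_get_output($tidy));
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $result = array();
    foreach ($xpath->query('//p') as $p) {
        // Collapse whitespace that line wrapping may have introduced.
        $result[] = trim(preg_replace('/\s+/', ' ', $p->textContent));
    }
    return $result;
}

// Usage: neither <p> is closed and <b> is left open.
print_r(extractParagraphs('<div><p>One<p>Two <b>bold</div>'));
```

The repaired string could equally be passed to Simple HTML DOM; the point is only that the parser runs after Tidy has closed the tags.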