PHP
Kirill Krasin, 2017-10-13 10:33:31

Parsing data from an HTML page in PHP: how to implement it with SimpleHTMLDOM or similar?

Hello. I need to fetch a page by URL, extract data from it, and put that data into an array. The data on the page is updated periodically, and there is a trigger that runs the script which parses the page and stores the results in the database. I would appreciate examples and tips on parsing this properly. Thanks in advance.
The structure of the required block:

<section class="SECTION_CLASS">
<div class="container">
<changelists>
<changelist>
<version>VER</version>
<information>DATE</information>
<ul>
<li><changetype class="CLASS_A">new</changetype><p> CHANGED</p></li>
<li><changetype class="CLASS_B">removed</changetype><p> REMOVED</p></li>
...
</ul>
</changelist>
...
</changelists>
</div>
</section>

1 answer(s)
matperez, 2017-10-13
@matperez

Have you tried anything yet? In principle, the example from the project page should be enough:

// Load the library (assumes simple_html_dom.php sits next to this script)
require_once 'simple_html_dom.php';

// Fetch the page
$html = file_get_html('http://slashdot.org/');

// Find the information by CSS selector
$articles = [];
foreach ($html->find('div.article') as $article) {
    // Parse each result and collect it into the array
    $item = [];
    $item['title']   = $article->find('div.title', 0)->plaintext;
    $item['intro']   = $article->find('div.intro', 0)->plaintext;
    $item['details'] = $article->find('div.details', 0)->plaintext;
    $articles[] = $item;
}

print_r($articles);

find('selector') returns all elements matching the selector as an array. Use it for repeated elements, such as each changelist inside changelists.
find('selector', 0) returns the first matching element, or null if nothing matches. Use it when you know there is only one such element, for example version or information inside a changelist.
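
To make this concrete for the changelist markup from the question, here is a minimal sketch. Since the question allows "SimpleHTMLDOM or similar", it uses PHP's built-in DOMDocument and DOMXPath instead of SimpleHTMLDOM, so it runs without extra files. The function name parseChangelists, the array keys, and the sample values (1.2.0, the date, the change texts) are assumptions made up for illustration; in a real script the markup would come from the fetched page rather than from an inline string.

```php
<?php
// Sample markup mirroring the structure from the question.
// All concrete values here are placeholders, not real data.
$html = <<<'HTML'
<section class="SECTION_CLASS">
<div class="container">
<changelists>
<changelist>
<version>1.2.0</version>
<information>2017-10-13</information>
<ul>
<li><changetype class="CLASS_A">new</changetype><p> Added export</p></li>
<li><changetype class="CLASS_B">removed</changetype><p> Dropped legacy API</p></li>
</ul>
</changelist>
</changelists>
</div>
</section>
HTML;

// Hypothetical helper: turns the changelist markup into a nested array.
function parseChangelists(string $html): array
{
    // Custom tags such as <changelist> make libxml emit warnings; collect them silently.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $result = [];

    // Many elements expected: iterate, like find('selector') without an index.
    foreach ($xpath->query('//changelists/changelist') as $changelist) {
        $entry = [
            // Single elements expected: read the first match, like find('selector', 0).
            'version' => trim($xpath->evaluate('string(version)', $changelist)),
            'date'    => trim($xpath->evaluate('string(information)', $changelist)),
            'changes' => [],
        ];

        foreach ($xpath->query('ul/li', $changelist) as $li) {
            $entry['changes'][] = [
                'type'  => trim($xpath->evaluate('string(changetype)', $li)),
                'class' => $xpath->evaluate('string(changetype/@class)', $li),
                'text'  => trim($xpath->evaluate('string(p)', $li)),
            ];
        }

        $result[] = $entry;
    }

    return $result;
}

$changes = parseChangelists($html);
print_r($changes);
```

The resulting array has one entry per changelist, each with its version, date, and a list of changes, which should map fairly directly onto database rows. With SimpleHTMLDOM the same two-level loop would use find('changelist') for the outer loop and find('version', 0) for the single values.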
