PHP
jacksparrow, 2015-03-02 13:30:27

Why does Habr respond inconsistently when I parse it?

I'm trying to parse a page from Habr with the SimpleHTMLDom library, but it only works some of the time. file_get_contents gives a similar result.

$url='http://habrahabr.ru/post/251871/';
$html = SimpleHTMLDom::file_get_html($url);
echo $html;die;

The result is sometimes normal HTML and sometimes this: https://monosnap.com/image/1kQHsjYpsBgLnFD0YlkYW60...
Moreover, the same page can fail several times in a row and then load normally five minutes later.
P.S. I know Habr has an API, but access to it is closed. Similar questions either do not address this particular problem or have no answer. Thanks in advance.


1 answer(s)
direx, 2015-03-02
@ponich

Try making the request the way a regular user's browser would on Habr: with a User-Agent and cookies. To do this, use cURL, or send the necessary headers through file_get_contents, and then feed the HTML you fetched to your library.
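
A minimal sketch of the cURL variant, in case it helps. The helper names (browser_headers, fetch_page), the User-Agent string, and the other header values are assumptions for illustration; nothing here is confirmed to be what Habr actually checks.

```php
<?php
// Hypothetical helper: builds browser-like request headers.
// The header values are placeholders, not values Habr is known to require.
function browser_headers(array $cookies = []): array
{
    $headers = [
        'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0 Safari/537.36',
        'Accept: text/html,application/xhtml+xml;q=0.9,*/*;q=0.8',
        'Accept-Language: ru,en;q=0.8',
    ];
    if ($cookies) {
        // Cookie header format: "name=value; name2=value2"
        $pairs = [];
        foreach ($cookies as $name => $value) {
            $pairs[] = $name . '=' . $value;
        }
        $headers[] = 'Cookie: ' . implode('; ', $pairs);
    }
    return $headers;
}

// Hypothetical helper: fetches a page with cURL.
// Returns the body, or null on a transport error or a non-200 status.
function fetch_page(string $url, array $cookies = []): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_ENCODING       => '',    // accept and decode any supported compression
        CURLOPT_TIMEOUT        => 15,
        CURLOPT_HTTPHEADER     => browser_headers($cookies),
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return ($body !== false && $status === 200) ? $body : null;
}
```

Usage would look roughly like `$body = fetch_page($url, $cookies);` where `$cookies` is an array of cookie names and values copied from a logged-in browser session. If `$body` is not null, pass it to the parser's string entry point (in Simple HTML DOM that is `str_get_html($body)`; whether your SimpleHTMLDom wrapper exposes it is an assumption). Checking for null first also lets you retry instead of echoing an error page.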
