PHP
sergey1989, 2016-01-22 23:04:55

A Yandex news parser in WordPress?

Good day. I want to try writing a parser that pulls news from Yandex into WordPress. Which library is best suited for this? I am currently planning to use PHP Simple Dom + cURL. Also, is it better to write the parsed results to the WordPress posts table and run the scripts through cron, or to display the results directly from the parser itself? And what pitfalls might such a task have?


3 answer(s)
Daemon23RUS, 2016-01-22
@Daemon23RUS

Look into RSS:
https://news.yandex.ru/export.html
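
As a rough illustration of the RSS route, here is a minimal sketch that reads items from a feed using PHP's built-in SimpleXML, so no DOM library is needed. It assumes the feed follows the usual RSS 2.0 layout (`channel` > `item` with `title`, `link`, `description`, `pubDate`). The function name `parse_rss_items` and the sample feed are made up for the example; in practice the XML string would come from one of the feeds listed on the export page.

```php
<?php
// Hypothetical helper (not from any library): converts an RSS 2.0 document
// into a plain array of items. Returns an empty array on invalid XML.
function parse_rss_items(string $xml): array
{
    // Collect XML errors quietly instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $feed = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false || !isset($feed->channel->item)) {
        return [];
    }

    $items = [];
    foreach ($feed->channel->item as $item) {
        $items[] = [
            'title'       => trim((string) $item->title),
            'link'        => trim((string) $item->link),
            'description' => trim((string) $item->description),
            // Unix timestamp, or null if pubDate is missing or unparsable.
            'date'        => strtotime((string) $item->pubDate) ?: null,
        ];
    }
    return $items;
}

// Invented sample feed, used only to show the expected input shape.
$sample = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Example headline</title>
      <link>http://example.com/story</link>
      <description>Short summary.</description>
      <pubDate>Fri, 22 Jan 2016 20:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>
XML;

$items = parse_rss_items($sample);
print_r($items);
```

To fetch the XML over HTTP, the string passed to `parse_rss_items` could come from cURL or, inside WordPress, from `wp_remote_get()`.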

Silm, 2016-01-22
@Silm

If you parse HTML, then your choice of tools is basically right. It may be worth comparing PHP Simple Dom with other DOM-parsing libraries, and it helps to wrap cURL in something that makes sending requests and receiving responses more pleasant.
If you work with a feed, you will not need to parse the DOM at all.
If the parsed results are meant to become permanent content of the site, write them to the table. It really depends on the details of your idea.
Yandex is quite protective of its content. After a certain number of suspicious requests you may start receiving a captcha, and you will have to deal with that somehow.
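
For the "into the table, run through cron" option, one possible shape is sketched below. It is not a finished plugin. `news_item_to_post`, `import_news_items`, the hook name `my_news_import`, and the meta key `source_url` are all hypothetical names chosen for the example; `wp_insert_post()`, `add_action()`, `wp_next_scheduled()`, and `wp_schedule_event()` are standard WordPress functions. The insert function is passed in as a callable so the mapping logic can be tried outside WordPress.

```php
<?php
// Hypothetical helper: maps one parsed news item to the argument array
// accepted by wp_insert_post(). Saving as a draft is an assumption.
function news_item_to_post(array $item): array
{
    return [
        'post_title'   => $item['title'],
        'post_content' => $item['description'] . "\n\nSource: " . $item['link'],
        'post_status'  => 'draft',
        'post_type'    => 'post',
        // Keep the original URL so duplicates can be detected later.
        'meta_input'   => ['source_url' => $item['link']],
    ];
}

// Hypothetical helper: inserts every item through $insert and returns
// how many inserts reported success (a positive integer post ID).
function import_news_items(array $items, callable $insert): int
{
    $created = 0;
    foreach ($items as $item) {
        $result = $insert(news_item_to_post($item));
        if (is_int($result) && $result > 0) {
            $created++;
        }
    }
    return $created;
}

// WordPress wiring; skipped when the file runs outside WordPress.
if (function_exists('add_action')) {
    add_action('my_news_import', function () {
        $items = []; // placeholder: fetch and parse the feed here
        import_news_items($items, 'wp_insert_post');
    });

    // Register an hourly WP-Cron event once.
    if (!wp_next_scheduled('my_news_import')) {
        wp_schedule_event(time(), 'hourly', 'my_news_import');
    }
}
```

WP-Cron only fires when the site receives visits, so on a low-traffic site a real system cron job that requests `wp-cron.php` may be more reliable. Fetching on a schedule rather than on every page view also keeps the number of requests to Yandex low.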

Yuri, 2016-01-23
@riky

A list of 5 news items with links is very easy to parse from Yandex.
The pitfall is that Yandex only gives links to other sites, so the actual content has to be parsed from those sites.
And extracting the main content from an arbitrary site is a level-100+ task.
