Parsing
Ilya Saveliev, 2018-12-04 09:39:18

Parsing content on WordPress: how can this be done?

Given: several sites whose headings I need to parse onto my own site.
Problem: these sites have no common URL prefix under which the entries are located, and there is no RSS feed.

Example: I need to parse news whose list is located at site.ru/news/, while the URL of each individual news item is site.ru/nazvanie-novosti/. That is, the article URLs share no common feature.

The aftparser plugin is dead: it does not work with PHP >= 7. wpgrabber works from either an HTML or an RSS source, but in neither mode can it be configured for this case.
Can you advise a solution?


5 answer(s)
LegoG, 2020-02-27
@LegoG

For WordPress there is the plink.top parser. It is paid, but it works exactly as you need.

Alexander Sobolev, 2018-12-04
@san_jorich

Do you have access to the admin panels?

uRoot, 2018-12-04
@uroot

I know of paid plugins for WP, such as Scrares. In it you visually indicate where the text is, where the picture is, and where the heading is, and on the plugin side you specify where the parsed content should go.

Orkhan Hasanli, 2019-02-04
@azerphoenix

Example: I need to parse news whose list is located at site.ru/news/, while the URL of each individual news item is site.ru/nazvanie-novosti/. That is, the article URLs share no common feature.

Yes, indeed, I was not able to parse such pages with WPGrabber either.
Why not write your own parser in PHP or any other language and run it from cron? Roughly speaking: get the list of links from /news, go through the list, and parse the content of each page. Then generate an SQL file from the result and feed it to WP (or, alternatively, parse it into an XLS table and import that into the site with the WP All Import plugin, plus create a cron task that runs the WP All Import import).
If you only need to parse the content once, Visual Web Ripper is a very good program. You visually choose what to parse, specify pagination, and so on; the program then parses the pages and builds a table. All that remains is to import it with WP All Import.
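
A rough sketch of the "get the list of links from /news, go through the list and parse" idea, in Python with only the standard library. Since the article URLs share no pattern, the links are picked by where they sit in the markup instead. The assumptions here are mine, not from the question: that each news link on the listing page sits inside an `<h2>`, that the article title is the first `<h1>`, and that a CSV table is an acceptable import format. All function and class names are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class HeadingLinkCollector(HTMLParser):
    """Collects href values of <a> tags that appear inside a heading tag."""

    def __init__(self, heading_tag="h2"):  # "h2" is an assumption about the markup
        super().__init__()
        self.heading_tag = heading_tag
        self.in_heading = False
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == self.heading_tag:
            self.in_heading = True
        elif tag == "a" and self.in_heading:
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == self.heading_tag:
            self.in_heading = False


class FirstH1Collector(HTMLParser):
    """Collects the text of the first <h1> on a page."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self.done:
            self.in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1" and self.in_h1:
            self.in_h1 = False
            self.done = True

    def handle_data(self, data):
        if self.in_h1:
            self.parts.append(data)


def article_links(listing_html, listing_url):
    """Return absolute article URLs from the listing page, deduplicated, in page order."""
    collector = HeadingLinkCollector()
    collector.feed(listing_html)
    seen, result = set(), []
    for href in collector.hrefs:
        url = urljoin(listing_url, href)
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result


def first_h1(page_html):
    """Return the stripped text of the first <h1>, or an empty string if there is none."""
    collector = FirstH1Collector()
    collector.feed(page_html)
    return "".join(collector.parts).strip()


def fetch(url):
    """Download a page and decode it as UTF-8 (assumed encoding)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(listing_url, fetch=fetch):
    """Fetch the listing, then every linked article; return (title, url) rows."""
    rows = []
    for url in article_links(fetch(listing_url), listing_url):
        rows.append((first_h1(fetch(url)), url))
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["title", "url"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `rows_to_csv(crawl("https://site.ru/news/"))` would then produce the table to import; the script itself could be scheduled with cron. For a real site the heading tag and the title selector would have to be adjusted to the actual markup, and the article body would need its own extraction step.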

qwers.com, 2020-12-30
@qwers.com

wpgrabber qwew.ru/wpgrabber/2525-wpgrabber-4-9-8.html
