Parsing
Karen Kratyan, 2021-08-19 20:18:57

Is there a solution (extension, service) for collecting (parsing) news from HTML pages, storing it, and displaying it?

I need to extract data from HTML pages (not from RSS), store it, and display it with sorting. Most of what exists only parses and exports to some file format. What I need is to get the data out of the HTML (text, link, date, etc.), store it, and display it.

Upd.
We need something similar to an RSS reader, but one that works with HTML: for each source you set selectors / XPath for the title, URL, image, date, description, etc., and it parses the pages and writes this data to a database (displaying the output is optional).
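
To make the requirement concrete, here is a minimal sketch of that idea in Python, using only the standard library. Everything in it is an assumption for illustration: the `SOURCES` layout, the `example-news` source name, the sample markup, and the `news` table schema are hypothetical. It also assumes the page is well-formed markup so that `xml.etree.ElementTree` can parse it; real-world HTML usually is not, and would need a tolerant parser instead.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical per-source rules: one XPath for the repeating item, then a
# relative XPath and an optional attribute name for each field.
SOURCES = {
    "example-news": {
        "item": ".//div[@class='news-item']",
        "fields": {
            "title": ("./h2/a", None),
            "url": ("./h2/a", "href"),
            "date": ("./time", "datetime"),
            "description": ("./p", None),
        },
    },
}

# Made-up, well-formed sample page (assumption: real pages are messier).
SAMPLE_HTML = """
<html><body>
  <div class="news-item">
    <h2><a href="/news/2">Second story</a></h2>
    <time datetime="2021-08-19">19 Aug 2021</time>
    <p>Newer item.</p>
  </div>
  <div class="news-item">
    <h2><a href="/news/1">First story</a></h2>
    <time datetime="2021-08-18">18 Aug 2021</time>
    <p>Older item.</p>
  </div>
</body></html>
"""


def extract_items(html, rules):
    """Apply one source's XPath rules and return a list of dicts."""
    root = ET.fromstring(html.strip())
    items = []
    for node in root.findall(rules["item"]):
        record = {}
        for name, (path, attr) in rules["fields"].items():
            found = node.find(path)
            if found is None:
                record[name] = None
            elif attr is None:
                # No attribute requested: take the element's text content.
                record[name] = "".join(found.itertext()).strip()
            else:
                record[name] = found.get(attr)
        items.append(record)
    return items


def store_items(conn, source, items):
    """Insert items, skipping (source, url) pairs that are already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS news ("
        "source TEXT, title TEXT, url TEXT, date TEXT, description TEXT, "
        "UNIQUE(source, url))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO news "
        "VALUES (:source, :title, :url, :date, :description)",
        [{"source": source, **item} for item in items],
    )
    conn.commit()


def latest(conn, limit=10):
    """Return stored items, newest first (dates assumed to be ISO strings)."""
    return conn.execute(
        "SELECT date, title, url FROM news ORDER BY date DESC LIMIT ?",
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    parsed = extract_items(SAMPLE_HTML, SOURCES["example-news"])
    store_items(conn, "example-news", parsed)
    for row in latest(conn):
        print(row)
```

Adding another site would then only mean adding another entry to `SOURCES`; fetching the pages on a schedule and rendering the output are left out of the sketch.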


2 answer(s)
ValdikSS, 2021-08-23
@kratkar

https://github.com/mozilla/readability
There are many alternative implementations of this algorithm in other languages.
