What architecture to use for a news aggregator?
After some thought, I have the following picture in mind: one bot runs in an endless loop, polls the sites (RSS, sitemap, or parsing the news page), collects a list of news links, and adds them to a database (Redis or MongoDB). A second bot, also in a loop, follows the links and parses the news, then sends the results to the site API for further processing and insertion into the main database.
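The two-bot layout described above could be sketched roughly like this in Python. It is only an illustration: an in-memory `queue.Queue` stands in for the Redis/MongoDB link store, and `fetch_links`, `parse_article`, and `publish` are hypothetical callables that you would replace with real RSS/sitemap fetching, page parsing, and the call to the site API.

```python
import queue
from typing import Callable, Iterable, List


def collect_links(sources: Iterable[str],
                  fetch_links: Callable[[str], List[str]],
                  link_queue: "queue.Queue[str]") -> int:
    """Bot 1: ask every source for its current links and enqueue them."""
    count = 0
    for source in sources:
        for link in fetch_links(source):
            link_queue.put(link)
            count += 1
    return count


def parse_links(link_queue: "queue.Queue[str]",
                parse_article: Callable[[str], dict],
                publish: Callable[[dict], None]) -> int:
    """Bot 2: drain the queue, parse each link, pass the result onward."""
    processed = 0
    while True:
        try:
            link = link_queue.get_nowait()
        except queue.Empty:
            return processed  # nothing left for this pass
        publish(parse_article(link))
        processed += 1
```

Each function performs a single pass; a real bot would call it in a loop with a pause between passes. Keeping the queue between the two stages means the link collector and the parser can run at different speeds or in separate processes.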
A few questions remain: How can the bots/threads be synchronized to avoid duplicating news? How can the scan interval for a news source be set depending on the time of day? Which architecture is best suited for this?
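One possible way to approach the first two questions, sketched in Python under some assumptions: duplicates are detected by a normalized URL key, and the scan interval is picked from the time of day. The `SeenLinks` class is a hypothetical in-memory stand-in for a shared store; with Redis, the same "claim" could be done with `SADD`, which reports whether the member was newly added. The hours and intervals in `scan_interval_seconds` are made-up values for illustration.

```python
import hashlib
import threading
from datetime import time
from urllib.parse import urlsplit, urlunsplit


def link_key(url: str) -> str:
    """Normalize a URL (case of scheme/host, trailing slash, fragment) and hash it."""
    parts = urlsplit(url.strip())
    normalized = urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.rstrip("/"),
        parts.query,
        "",  # drop the fragment, it does not identify a different article
    ))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SeenLinks:
    """In-memory stand-in for a shared set of already-claimed links."""

    def __init__(self) -> None:
        self._seen = set()
        self._lock = threading.Lock()

    def claim(self, url: str) -> bool:
        """Return True only for the first caller that claims this link."""
        key = link_key(url)
        with self._lock:
            if key in self._seen:
                return False
            self._seen.add(key)
            return True


def scan_interval_seconds(now: time) -> int:
    """Poll more often during the day (assumed 07:00-23:00) than at night."""
    if time(7, 0) <= now < time(23, 0):
        return 300   # assumed daytime interval: 5 minutes
    return 1800      # assumed night interval: 30 minutes
```

A bot would call `claim(link)` before queuing or parsing a link and skip it when the result is `False`, so two workers never process the same article. The interval function can be extended with per-source settings if some sites publish on a different schedule.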
For managing threads and scheduling tasks, it is convenient to use something like Akka.