Parsing
Eugene, 2015-04-28 11:34:04

How can I parse the status of 10,000 products every hour with the least "blood"?

Hello,
I need to parse the status of about 10,000 products every hour. Each product has its own page showing its status. How can I do this with the least "blood", both for the donor server and for the server where the parser will run?
Thanks.

3 answer(s)
Sergey Protko, 2015-04-28
@Fesor

If you have access to the donor's server, ask them to implement notifications. Then, instead of parsing, you only need to run a daemon that accepts requests saying that something has changed. That eliminates idle traffic entirely.
In this setup, when a product changes, a notification is written to a queue saying that something changed and, possibly, information about what exactly changed. On your server, a client listens to the queue and, as soon as something appears there, picks it up for processing. This is relatively easy to implement, and there is no overhead.
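
A minimal sketch of the listening side, to make the idea concrete. It assumes an in-process `queue.Queue` as a stand-in for whatever message broker the donor would actually expose, and the event fields `product_id` and `status` are hypothetical names, not something the donor is known to send.

```python
import queue
import threading


def consume(notifications: queue.Queue, handle) -> None:
    """Block on the queue and process each change notification as it arrives."""
    while True:
        event = notifications.get()  # blocks, so nothing is sent while idle
        if event is None:            # sentinel value: stop the consumer
            notifications.task_done()
            break
        handle(event)
        notifications.task_done()


def run_demo() -> dict:
    """Simulate the donor publishing two changes and the client applying them."""
    statuses = {}
    q = queue.Queue()

    def handle(event: dict) -> None:
        # Hypothetical event shape: {"product_id": ..., "status": ...}
        statuses[event["product_id"]] = event["status"]

    worker = threading.Thread(target=consume, args=(q, handle))
    worker.start()

    # The donor side would write these when a product changes.
    q.put({"product_id": 42, "status": "out_of_stock"})
    q.put({"product_id": 7, "status": "in_stock"})
    q.put(None)

    worker.join()
    return statuses
```

With a real broker the `get()` call would be replaced by that broker's client library, but the shape stays the same: the client waits, and work happens only when something has changed.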

sim3x, 2015-04-28
@sim3x

0. "With the least blood": ask them to make an API.
1. Send a HEAD request and check whether the page has changed.
Either way, it is worth setting up a queue and spreading the requests as evenly as possible over time.
Your server will barely notice the load, because it only has to receive the traffic and parse it.
And hide your IP behind proxies.
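
A rough sketch of the HEAD check and the even spreading, using only the Python standard library. It assumes the donor server returns an `ETag` or `Last-Modified` header, which is not guaranteed; if neither is present, the page is treated as changed. The function names are illustrative.

```python
import urllib.request
from typing import Optional


def spread_offsets(n_items: int, period_s: float = 3600.0) -> list:
    """Return start offsets in seconds that spread n_items evenly over the period."""
    if n_items <= 0:
        return []
    step = period_s / n_items
    return [i * step for i in range(n_items)]


def fetch_validators(url: str, timeout: float = 10.0) -> dict:
    """Send a HEAD request and return the ETag / Last-Modified headers, if any."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return {
            "etag": resp.headers.get("ETag"),
            "last_modified": resp.headers.get("Last-Modified"),
        }


def has_changed(old: Optional[dict], new: dict) -> bool:
    """Compare validators; assume the page changed when they cannot be compared."""
    if old is None:
        return True
    for key in ("etag", "last_modified"):
        if new.get(key) is not None and old.get(key) is not None:
            return new[key] != old[key]
    return True  # no usable validator: fall back to a full fetch
```

For 10,000 products per hour, `spread_offsets(10000)` gives one request roughly every 0.36 seconds; only pages for which `has_changed` returns `True` need a full GET and parse.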

Gluck Virtualen, 2015-04-28
@gluck59

Why parse?
"Nineties 2.0" or what?
