Parsing
Crash, 2017-01-08 18:07:55

How can you "humanize" a web scraper?

I have some experience writing and running scrapers, and by now I have run into a problem: they get blocked. Requests from the scraper are recognized as automated, access to the site is cut off and a captcha is demanded, or it can be even worse: an abuse report is sent to the hosting provider and the whole VPS can be blocked.
How can a scraper be made more "human", so that its behavior is practically indistinguishable from that of a real user? How and what can be imitated? Of course, you can scrape through proxies, but that is a paid pleasure and does not really solve the problem on its own.

2 answers
Zelimkhan Beltoev, 2017-01-08
@Beltoev

Selenium, delays between requests, different user-agents
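
A minimal sketch of that idea in Python, assuming Chrome and chromedriver are installed; the user-agent strings, URLs and delay range below are placeholders, not tested values:

import random
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Placeholder pool of desktop user-agent strings to pick from per session.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0 Safari/537.36",
]

def make_driver():
    # A real browser sends normal headers, runs JavaScript and keeps cookies,
    # so the traffic looks far less like a bare HTTP client.
    options = Options()
    options.add_argument("user-agent=" + random.choice(USER_AGENTS))
    return webdriver.Chrome(options=options)

def scrape(urls):
    driver = make_driver()
    try:
        for url in urls:
            driver.get(url)
            # ... extract data from driver.page_source here ...
            # Randomized pause so the request timing does not look machine-generated.
            time.sleep(random.uniform(3, 10))
    finally:
        driver.quit()

Driving a visible browser this way is slower than plain HTTP requests, but scrolling and clicking through Selenium makes the session look much more like a real visitor.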

Oleg Gamega, 2017-01-08
@gadfi

Of course, you can scrape through proxies, but that is a paid pleasure.

But you are building paid solutions, so what is the problem?
There is no single fix here, only a set of measures: emulating different browsers, proxies, delays between requests, and much more (a rough sketch follows below).
Yes, proxies cost money, and to minimize costs you can skip them wherever you can get by without them, but in some cases that is simply not possible.
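
As a rough illustration of combining those measures with a plain HTTP client, here is a sketch in Python using requests; the proxy addresses, user-agent strings and delay range are made-up placeholders, not working endpoints:

import random
import time

import requests

# Made-up proxy endpoints and user-agent strings -- substitute your own.
PROXIES = [
    "http://user:pass@proxy1.example.com:8080",
    "http://user:pass@proxy2.example.com:8080",
]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0",
]

def fetch(session, url):
    # Rotate the exit IP and the browser signature on every request.
    proxy = random.choice(PROXIES)
    headers = {
        "User-Agent": random.choice(USER_AGENTS),
        "Accept-Language": "en-US,en;q=0.9",
    }
    return session.get(url, headers=headers,
                       proxies={"http": proxy, "https": proxy},
                       timeout=30)

def crawl(urls):
    # A Session keeps cookies between requests, like a real browser would.
    with requests.Session() as session:
        for url in urls:
            response = fetch(session, url)
            # ... parse response.text here ...
            time.sleep(random.uniform(2, 8))  # human-like pause between pages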
