Amazon Web Services
Leo, 2015-03-27 13:20:39

How to bypass a crawler ban on AWS?

The situation is this: to fill a database, I need to parse site N. The parser itself is written and runs through a proxy on Amazon.
However, the page also carries additional information that is obtained by sending a separate, well-formed HTTP request. From my local machine everything works, but through the proxy it does not: N has apparently banned Amazon's IP addresses and does not return the necessary information.
The question is whether anyone has ideas on how to get around this. For now I have had to reduce the load and send the additional request (for that extra information) through a home proxy, but this is not the best option.
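To make the current workaround concrete, here is a minimal sketch of it, not the actual parser code. It assumes two proxies: page requests go through the Amazon proxy and the additional request goes through the home proxy, with a throttle between requests. The proxy addresses, the request kinds `"page"` and `"extra"`, the 60-second interval, and all function names are hypothetical.

```python
import time
import urllib.request

# Hypothetical proxy endpoints (assumptions, not real hosts).
AWS_PROXY = "http://aws-proxy.example:3128"
HOME_PROXY = "http://home-proxy.example:3128"


def proxy_for(kind):
    """Pick the proxy for a request kind: "extra" goes via the home proxy."""
    return HOME_PROXY if kind == "extra" else AWS_PROXY


def make_opener(proxy_url):
    """Build a urllib opener that routes HTTP and HTTPS through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


class Throttle:
    """Enforce a minimum interval (in seconds) between calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch(url, kind, throttle, timeout=10):
    """Fetch url through the proxy chosen for this kind, after throttling."""
    throttle.wait()
    opener = make_opener(proxy_for(kind))
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

With this layout, only the small additional request depends on the home connection, for example `fetch(extra_url, "extra", Throttle(60))`, where `extra_url` stands for whatever endpoint serves the extra information.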
Competing with N by hunting for free proxies and the like is pointless, since they ban by IP very often.
Yes, I could give all users a browser extension that makes the same requests once per minute, but then a user would be able to tamper with the data being sent, which is undesirable.
Thank you!
