Website parsing, how to bypass protection?

Python

del4pp, 2020-04-26 20:09:11

Hello.
When I scrape the site from my home PC, it returns the data I need. But when I run the parser from a server through a proxy, every page of the site comes back with the same HTML and the parser gets a 403 response.
I have tried different proxies. Can anyone tell me how to get around this?

2 answer(s)

Sergey Karbivnichy, 2020-04-26
@del4pp

I recommend the program RSocks Proxy Checker. There are versions for Linux. Load your list of proxies into it and set 'ruru.hotmo.org' as the site to check against. When the check finishes, sort the results by "200 OK" and save those proxies. I just tested them in Python, and it works.
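
The same kind of check can be roughly sketched in Python if you would rather not use a separate tool. This is only a minimal sketch using the standard library, not how RSocks Proxy Checker itself works. The function names `fetch_status` and `filter_working_proxies`, the User-Agent string, the `https://` scheme, and the proxy addresses in the usage comment are all assumptions for illustration.

```python
import http.client
import urllib.error
import urllib.request


def fetch_status(url, proxy, timeout=10.0):
    """Fetch url through proxy; return the HTTP status code, or None if the request failed."""
    # Route both http and https traffic through the same proxy address.
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    # Assumed browser-like User-Agent; the default Python one is often rejected.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The site answered, but with an error status such as 403.
        return err.code
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Dead proxy, timeout, refused connection, and so on.
        return None


def filter_working_proxies(proxies, url, fetch=fetch_status):
    """Keep only the proxies through which url answers with 200."""
    return [proxy for proxy in proxies if fetch(url, proxy) == 200]


# Hypothetical usage (placeholder proxy addresses, scheme assumed):
# good = filter_working_proxies(
#     ["http://203.0.113.10:8080", "http://203.0.113.11:3128"],
#     "https://ruru.hotmo.org/",
# )
```

The `fetch` parameter is there only so the filtering step can be tried without real network access. In practice you would leave it at its default.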

Dimonchik, 2020-04-26
@dimonchik2013

Use reliable proxies that have been checked in advance against the specific site.
Google and giants like Amazon can't be gotten past that easily.