PHP
Fridary, 2016-07-25 00:57:00

How to bypass blocking access to the site when parsing?

The ras.arbitr.ru resource hosts more than a million documents, and I need to get their content by parsing.
I wrote a simple Python script. Search on the site works by sending POST requests. But as soon as I start the script, the resource blocks me and I can't access the site for one day; it says “Protect the system by your IP from scripts”.
I send the headers copied from Chrome DevTools.
I tried setting the cookies to random values, but it does not help.
How can I bypass the blocking? I would be very grateful to anyone who can advise.

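import json
import requests

page = 1    # page of results to request (example value)
count = 25  # results per page (example value)
payload = {"GroupByCase":False,"Count":count,"Page":page,"DateFrom":"2000-01-01T00:00:00","DateTo":"2030-01-01T23:59:59","Sides":[],"Judges":[],"Cases":[],"Text":""}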
headers = {
  "Accept":"application/json, text/javascript, */*",
  "Accept-Encoding":"gzip, deflate",
  "Accept-Language":"en-US,en;q=0.8,ru;q=0.6",
  "Connection":"keep-alive",
  "Content-Length":"149",
  "Content-Type":"application/json",
  "Cookie":"ASP.NET_SessionId=eob3w5vypepmykpcsixfpxyv; __utmt=1; CUID=49784dc2-a97e-4249-8c61-415fe5f6f081:QNsAJT4ya5WN7jeL7jCECg==; __utma=160997822.296078651.1469210605.1469210605.1469257019.2; __utmb=160997822.4.10.1469257019; __utmc=160997822; __utmz=160997822.1469210605.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)",
  "Host":"ras.arbitr.ru",
  "Origin":"http://ras.arbitr.ru",
  "Referer":"http://ras.arbitr.ru/",
  "User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36",
  "X-Requested-With":"XMLHttpRequest"
}
r = requests.post("http://ras.arbitr.ru/Ras/Search", data=json.dumps(payload), headers=headers)


2 answers
ragimovich, 2016-07-25
@fridary

Your headers don't matter: the ban is applied by IP once the allowed number of requests is exceeded. Use proxy servers and enjoy life.
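
A rough sketch of what that could look like with the requests library from the question: each search request goes out through the next proxy in a round-robin list, with a pause between requests. The proxy addresses are placeholders, the helper names `proxy_rotation` and `fetch_pages` are made up for illustration, and the 5-second pause is only a guess, since the site's real request limit is not known.

```python
import itertools
import time

SEARCH_URL = "http://ras.arbitr.ru/Ras/Search"  # endpoint from the question

# Placeholder addresses from a documentation-only range; substitute real proxies.
PROXY_URLS = ["http://203.0.113.10:3128", "http://203.0.113.11:3128"]


def proxy_rotation(proxy_urls):
    """Yield requests-style proxy mappings in round-robin order, indefinitely."""
    for url in itertools.cycle(proxy_urls):
        yield {"http": url, "https": url}


def fetch_pages(session, pages, count, rotation, pause=5.0):
    """POST one search request per page, each through the next proxy.

    `session` is expected to behave like requests.Session.
    `pause` (seconds between requests) is an assumption, not a known limit.
    """
    responses = []
    for page in pages:
        payload = {
            "GroupByCase": False, "Count": count, "Page": page,
            "DateFrom": "2000-01-01T00:00:00", "DateTo": "2030-01-01T23:59:59",
            "Sides": [], "Judges": [], "Cases": [], "Text": "",
        }
        # json= serializes the payload and sets Content-Type: application/json.
        responses.append(
            session.post(SEARCH_URL, json=payload, proxies=next(rotation), timeout=30)
        )
        time.sleep(pause)
    return responses
```

Usage would be along the lines of `fetch_pages(requests.Session(), range(1, 4), 25, proxy_rotation(PROXY_URLS))`. If the server turns out to need the other headers from the question, they can be set once with `session.headers.update(...)`.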

Rou1997, 2016-07-25
@Rou1997

Cookies have nothing to do with IP blocking. Use proxies, anonymizers and Tor (many SOCKS proxies).
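
For the Tor option, a minimal sketch, assuming a Tor client is already running locally and listening on its usual SOCKS port 9050 (Tor Browser typically uses 9150 instead), and that requests was installed with SOCKS support (`pip install requests[socks]`). The helper names `tor_proxies` and `use_tor` are made up for illustration.

```python
def tor_proxies(host="127.0.0.1", port=9050):
    """Build a requests-style proxy mapping for a local Tor SOCKS listener.

    The host and port defaults are assumptions about the local Tor setup.
    The socks5h scheme makes DNS lookups go through the proxy as well.
    """
    address = f"socks5h://{host}:{port}"
    return {"http": address, "https": address}


def use_tor(session, host="127.0.0.1", port=9050):
    """Route every request made through `session` via the Tor SOCKS proxy."""
    session.proxies.update(tor_proxies(host, port))
    return session
```

With this, `session = use_tor(requests.Session())` followed by the same `session.post(...)` call as in the question would go out through Tor rather than from your own IP.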
