How do I scrape a site that has a very interesting anti-bot protection?

Python

Boriol, 2018-09-18 13:03:07

The rbt.ru website has turned on protection against scraping. After a request is sent to the site, it redirects to a third-party service, ohio8.vchecks.info/.....
At the end of the JS script served there, 3 parameters are generated that are needed to build a new URL and to generate cookies.
Can anyone help me bypass this protection?
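
The redirect described above can be observed from a plain HTTP client before reaching for anything heavier. This is a minimal sketch using only the standard library. `left_original_host` and `check` are hypothetical helper names, and comparing hostnames is a simplifying assumption: it only catches HTTP-level redirects, and a harmless redirect between, say, rbt.ru and www.rbt.ru would be flagged as well.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def left_original_host(requested_url, final_url):
    """Return True if the final URL is on a different host than the requested one."""
    return urlparse(requested_url).hostname != urlparse(final_url).hostname


def check(url):
    """Fetch url (urlopen follows HTTP redirects) and report whether the host changed."""
    with urlopen(url, timeout=10) as response:
        return left_original_host(url, response.geturl())
```

Calling `check('https://www.rbt.ru/')` would need network access and may raise `urllib.error.HTTPError` if the site rejects the client outright. A `True` result would only suggest that the request ended up on the checking service; it says nothing about how the 3 parameters are computed.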

4 answer(s)

alex5e, 2018-09-18
@alex5e

As a surefire option, you can use headless Chrome driven through WebDriver from Python, though this will require more resources than a regular HTTP client.

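from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

# Start a local chromedriver process (both paths are placeholders)
service = Service('/path/to/chromedriver')
service.start()

options = Options()
options.add_argument('--headless')  # run Chrome without a visible window
options.binary_location = '/path/to/custom/chrome'

driver = webdriver.Remote(service.service_url, options=options)
try:
    driver.get('http://www.google.com/xhtml')
finally:
    # Close the browser and stop chromedriver even if the page load fails
    driver.quit()
    service.stop()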
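
If the browser is only needed to get past the check, one possible follow-up is to copy its cookies into a regular HTTP client afterwards, so that the heavier browser is not used for every request. The sketch below is an assumption about how that could look and has not been tried against rbt.ru. `cookie_header` is a hypothetical helper, and it relies on `driver.get_cookies()` returning a list of dicts with `name` and `value` keys. The sample cookies are made up.

```python
def cookie_header(cookies):
    """Build a Cookie header value from Selenium-style cookie dicts."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


# Made-up cookies shaped like the output of driver.get_cookies()
sample = [
    {"name": "sid", "value": "abc123", "domain": "example.com"},
    {"name": "token", "value": "xyz", "domain": "example.com"},
]
print(cookie_header(sample))  # sid=abc123; token=xyz
```

With a live driver this would be called as `cookie_header(driver.get_cookies())` before `driver.quit()`, and the result sent as the `Cookie` header of later requests. Whether the site accepts those cookies from another client is not known; it may also check the User-Agent, in which case the browser's value would have to be reused too.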

Danil Sapegin, 2018-09-18
@ynblpb_spb

PhantomJS / CasperJS

Maxim Timofeev, 2018-09-18
@webinar

I agree with my colleagues, and I'll add a ready-made tool in case you don't feel like doing it by hand:
sbfactory.ru/?p=600