Leon1010, 2019-10-16 12:51:37
JavaScript

How to scrape sites with anti-scraping protection?

I need an API that takes a URL and returns the HTML of that page after the JavaScript-redirect protection has been passed. The site sits behind Variti, a service that proxies requests to the site: it first serves a verification page that generates a hash in JavaScript from browser parameters and then redirects to the real page. An example of a site using the service: bi-bi.ru
As a result, a plain curl request returns only the verification page, not the HTML of the real page.
Services such as import.io and others like it can get past this protection. However, I need to pull the full HTML of any page without first having to add its URL to the service's constructor.
Please suggest a solution.
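
To make the requirement concrete, here is a rough sketch of the kind of API I mean. It assumes Node.js and the third-party `puppeteer` package (headless Chromium), on the theory that a real browser engine will execute the verification script and follow the redirect. The `/render` path and the names `parseTargetUrl`, `renderWithPuppeteer` and `createRenderServer` are made up for illustration, and I have not confirmed that this actually gets past Variti's check.

```javascript
// Sketch only: assumes Node.js and the third-party `puppeteer` package.
const http = require('http');

// Extract and validate the `url` query parameter; returns null if unusable.
function parseTargetUrl(requestUrl) {
  try {
    const raw = new URL(requestUrl, 'http://localhost').searchParams.get('url');
    if (!raw) return null;
    const target = new URL(raw);
    return ['http:', 'https:'].includes(target.protocol) ? target.href : null;
  } catch (err) {
    return null; // not a parseable absolute URL
  }
}

// Load the page in headless Chromium so the verification script can run,
// then return the DOM serialized once the network has gone quiet.
async function renderWithPuppeteer(targetUrl) {
  const puppeteer = require('puppeteer'); // loaded lazily; must be installed
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(targetUrl, { waitUntil: 'networkidle0', timeout: 60000 });
    return await page.content();
  } finally {
    await browser.close();
  }
}

// Hypothetical endpoint: GET /render?url=<encoded page URL>
function createRenderServer(render = renderWithPuppeteer) {
  return http.createServer(async (req, res) => {
    const targetUrl = parseTargetUrl(req.url);
    if (!targetUrl) {
      res.writeHead(400, { 'Content-Type': 'text/plain' });
      res.end('Missing or invalid "url" parameter');
      return;
    }
    try {
      const html = await render(targetUrl);
      res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
      res.end(html);
    } catch (err) {
      res.writeHead(502, { 'Content-Type': 'text/plain' });
      res.end('Rendering failed: ' + err.message);
    }
  });
}
```

It would be started with something like `createRenderServer().listen(3000)` and called as `GET /render?url=https%3A%2F%2Fbi-bi.ru%2F`. Waiting for `networkidle0` is only a guess at when the redirect has finished; a protection that delays the redirect or fingerprints headless browsers may need a different wait condition or may block this entirely.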
