PHP
Alexander, 2022-03-13 23:17:59

How can I scrape a site with PHP when it is protected against scraping?

Previously, the parser on my site worked without problems. Recently I noticed it had gone quiet; when I checked, the logs were full of 403 errors.
I tried various methods found via Google, but each one returns either a 403 error or "Please wait while we check your browser"...
This is the code I use now, but it does not work:

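<?php
$url = 'https://www.osta.ee/ru/zavershajutsja';
$options = [
  'http' => [
    'user_agent' => 'Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78.0',
    // The stream option is named 'protocol_version'; a 'protocol' key is ignored.
    'protocol_version' => 1.1,
    'header' => [
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.5,en;q=0.3',
        'Upgrade-Insecure-Requests: 1',
        'Host: www.osta.ee',
    ],
    // Return the response body even on a 403 so the block page can be inspected.
    'ignore_errors' => true,
  ]
];
$context = stream_context_create($options);
$html = file_get_contents($url, false, $context);
// After the call, $http_response_header holds the status line and response headers.
?>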

3 answer(s)
Nadim Zakirov, 2022-03-14
@zkrvndm

The page opens fine for me; I tried it from several other countries and it opened every time. If it does not open for you, the IP address of your server is apparently banned, so you need to use a proxy.
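A minimal sketch of the proxy approach, staying with the stream-context style from the question. The helper name `proxyContextOptions` and the proxy address `203.0.113.10:8080` are placeholders, not something from this thread; substitute a proxy you actually have access to. Whether the site then lets the request through is not guaranteed.

```php
<?php
// Hypothetical helper: builds stream context options that route the request
// through an HTTP proxy instead of the server's own (possibly banned) IP.
function proxyContextOptions(string $proxy, string $userAgent): array
{
    return [
        'http' => [
            'proxy'         => $proxy,      // format: tcp://host:port
            'user_agent'    => $userAgent,
            'timeout'       => 15,          // seconds
            'ignore_errors' => true,        // still return the body on 4xx/5xx
        ],
    ];
}

// Usage (placeholder proxy address, assumed for illustration only):
// $context = stream_context_create(
//     proxyContextOptions('tcp://203.0.113.10:8080', 'Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78.0')
// );
// $html = file_get_contents('https://www.osta.ee/ru/zavershajutsja', false, $context);
```

If the response is still a 403 through the proxy, the block is probably not based on the IP address alone.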

ThunderCat, 2022-03-13
@ThunderCat

Open the page in a browser, look at the request headers it sends, and reproduce them in your code.
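One way this could look, as a sketch: copy the request headers from the Network tab of the browser's developer tools into an array and pass them to the stream context. The helper name `headerLines` and all header values below (including the `Referer` and `Cookie` entries) are illustrative assumptions; the real values have to come from your own browser session.

```php
<?php
// Hypothetical helper: turns name => value pairs copied from the browser into
// the "Name: value" lines that the 'header' context option expects.
function headerLines(array $headers): array
{
    $lines = [];
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

// Example values only; replace them with the ones your browser really sends.
// Accept-Encoding is left out on purpose: file_get_contents() does not
// decompress gzip responses by itself.
$browserHeaders = [
    'Accept'          => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Language' => 'ru-RU,ru;q=0.8,en-US;q=0.5,en;q=0.3',
    'Referer'         => 'https://www.osta.ee/',
    'Cookie'          => 'name=value', // placeholder cookie
];

$options = [
    'http' => [
        'user_agent'    => 'Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78.0',
        'header'        => headerLines($browserHeaders),
        'ignore_errors' => true, // keep the body on 403 for inspection
    ],
];
$context = stream_context_create($options);
// $html = file_get_contents('https://www.osta.ee/ru/zavershajutsja', false, $context);
```

Matching headers only helps if the site checks headers; a JavaScript-based browser check will not be passed this way.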

coderisimo, 2022-03-13
@coderisimo

You can try a headless browser, which executes the page's JavaScript the way a regular browser does.
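A rough sketch of calling a headless browser from PHP, assuming Chromium is installed on the server and available as `chromium` (the binary name varies by system, for example `google-chrome` or `chromium-browser`). The helper name `headlessDumpCommand` is made up for this example. The `--headless` and `--dump-dom` flags ask Chrome to print the rendered DOM to standard output; there is no guarantee that this passes the site's browser check.

```php
<?php
// Hypothetical helper: builds the shell command that asks headless Chromium
// to load a page and print the rendered DOM.
function headlessDumpCommand(string $url, string $binary = 'chromium'): string
{
    // escapeshellarg() quotes the URL so it cannot inject extra shell arguments.
    return escapeshellcmd($binary) . ' --headless --disable-gpu --dump-dom ' . escapeshellarg($url);
}

// Usage (requires shell_exec to be enabled and the browser to be installed):
// $html = shell_exec(headlessDumpCommand('https://www.osta.ee/ru/zavershajutsja'));
// if ($html === null || $html === false) {
//     // The command failed or produced no output.
// }
```

Starting a browser per request is slow, so this fits a scheduled job better than a page that scrapes on every visit.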
