PHP
lexstile, 2019-06-03 18:39:46

I was scraping a wholesaler's product catalog and got blocked. What can I do?

I send each request like this:

function request($url){
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
  curl_setopt($ch, CURLOPT_REFERER, $url);
  curl_setopt($ch, CURLOPT_POST, 0);
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
  curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/6.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36");
  $result = curl_exec($ch);
  $info = curl_getinfo($ch);
  if ($info['http_code'] != 200) {
    curl_close($ch);
    return false;
  }
  curl_close($ch);
  return str_get_html($result);
}

Is there any way to get around the block?
At first I thought the block was by IP, but apparently not: I switched to a new IP and it did not help.
I sent a different browser User-Agent; that did not help either.
I sent additional headers; that did not help.
--
Any ideas what else to try?
Thank you in advance for your answer.
P.S. On the next request they return a stub page that says, roughly, that one should not do this.
UPD:
print_r(curl_getinfo($ch)):
<code lang="php">
Array
(
    [url] => https://site.ru/
    [content_type] => text/html; charset=utf-8
    [http_code] => 403
    [header_size] => 308
    [request_size] => 240
    [filetime] => -1
    [ssl_verify_result] => 20
    [redirect_count] => 0
    [total_time] => 0.020305
    [namelookup_time] => 4.4E-5
    [connect_time] => 0.000747
    [pretransfer_time] => 0.007452
    [size_upload] => 0
    [size_download] => 600
    [speed_download] => 30000
    [speed_upload] => 0
    [download_content_length] => -1
    [upload_content_length] => -1
    [starttransfer_time] => 0.020198
    [redirect_time] => 0
    [redirect_url] => 
    [primary_ip] => 36.32.116.75
    [certinfo] => Array
        (
        )

    [primary_port] => 443
    [local_ip] => 31.170.122.143
    [local_port] => 42420
)
</code>


2 answer(s)
Alexander Pushkarev, 2019-06-03
@lexstile

1) Change CURLOPT_USERAGENT every N requests.
2) The site may be setting cookies; check for them and send them back with subsequent requests.
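
The two suggestions above could be combined roughly as follows. This is only a sketch: `pickUserAgent()` and `requestWithCookies()` are hypothetical helper names, the User-Agent strings and the cookie file path are example values, and whether the site's block depends on cookies or the User-Agent at all is an assumption that has not been verified.

```php
<?php
// Hypothetical helper: returns the User-Agent for request number $requestNo,
// moving to the next entry in $agents after every $perAgent requests.
function pickUserAgent(array $agents, int $requestNo, int $perAgent): string
{
    return $agents[intdiv($requestNo, $perAgent) % count($agents)];
}

// Hypothetical variant of request() from the question that keeps cookies
// between calls. Returns the response body, or false on any non-200 result.
function requestWithCookies(string $url, string $userAgent, string $cookieFile)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    // Send cookies stored by earlier requests...
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);
    // ...and save any cookies the site sets once the handle is released.
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
    $result = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return ($result !== false && $code === 200) ? $result : false;
}

// Example values only; replace with User-Agent strings of real browsers.
$agents = [
    'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0',
];
```

A caller would then loop over its URLs with a counter `$i` and call `requestWithCookies($url, pickUserAgent($agents, $i, 20), $cookieFile)`, where `20` stands in for N and `$cookieFile` is any writable path, passing the returned body to `str_get_html()` as in the question.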

FanatPHP, 2019-06-03
@FanatPHP

Write to them. Introduce yourself, explain that you mean no harm, and ask either for permission to parse the site as is or for access to a proper API.
