Is Amazon blocking all scrapers?
For several months I ran an ordinary Amazon product scraper (PHP + cURL + proxies) and everything worked fine. This afternoon it broke: the code has not changed, yet every page requested from Amazon, whether directly or through any one of thousands of proxies, now returns 503.
Until now it has never happened that every single response was a 503.
Has anyone else run into this? I assume others here scrape Amazon too. How do you solve it? Or does Amazon do this from time to time?
The same pages load normally in a browser but fail through cURL. Here is an example:
// $url and $user_agent are set elsewhere in the script.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// Browser-like request headers.
$headers = array(
    'Host: www.amazon.de',
    'User-Agent: ' . $user_agent,
    'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language: en-US;q=0.5,en;q=0.3',
    'Accept-Encoding: gzip, deflate, br',
    'Referer: https://www.amazon.de/',
    'Connection: keep-alive',
    'Upgrade-Insecure-Requests: 1'
);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt($ch, CURLOPT_TIMEOUT, 30);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, FALSE);  // do not follow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);   // return the response instead of printing it
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);  // certificate verification is disabled
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_HEADER, TRUE);           // include response headers in $content
curl_setopt($ch, CURLOPT_POST, FALSE);
curl_setopt($ch, CURLOPT_ENCODING, "");           // accept every encoding this cURL build supports
$content = curl_exec($ch);
curl_close($ch);
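Because the request above sets CURLOPT_HEADER, $content holds the response headers followed by the body. To confirm that a failure really is a 503 and to look at what came back, the two parts can be separated. This is only a minimal sketch: split_response() is a hypothetical helper, not part of the original script, and it assumes the header size comes from curl_getinfo($ch, CURLINFO_HEADER_SIZE), read before curl_close().

```php
<?php
// Hypothetical helper (not from the original script): splits a raw cURL
// response fetched with CURLOPT_HEADER = true into status, headers and body.
function split_response(string $raw, int $headerSize): array
{
    $headerText = substr($raw, 0, $headerSize);
    $body = substr($raw, $headerSize);

    // Redirects or proxy tunnels can produce more than one header block,
    // so collect every status line and keep the last one.
    preg_match_all('#^HTTP/\S+\s+(\d{3})#m', $headerText, $m);
    $status = $m[1] ? (int) end($m[1]) : 0;

    return ['status' => $status, 'headers' => $headerText, 'body' => $body];
}
```

In the script above it would be called between curl_exec() and curl_close(), for example $r = split_response($content, curl_getinfo($ch, CURLINFO_HEADER_SIZE)); after which $r['status'] can be compared against 503 and $r['body'] logged for inspection.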
Amazon runs on AWS, AWS has the CloudFront CDN, CloudFront has a WAF (Web Application Firewall), and the WAF runs on machine learning. Unless you buy proxies in another part of the world and change the request signature beyond recognition, there is no way through: the machine has already evaluated you, weighed you, and can smell you from a kilometer away.
But I don't see a proxy in the cURL code...
Install a Tor proxy on the server and add it to cURL:
curl_setopt($ch, CURLOPT_PROXY, 'localhost:9050');
curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5);
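As a rough sketch of how the proxy suggestion could be folded into the request setup, the options can be collected in one array and applied with curl_setopt_array(). The function name build_curl_options() and its parameters are assumptions made for illustration. It also uses CURLPROXY_SOCKS5_HOSTNAME rather than the CURLPROXY_SOCKS5 shown above, so that the hostname is resolved through the proxy; that is a deliberate swap, not what the answer proposed.

```php
<?php
// Hypothetical helper: builds a cURL option array, optionally routed
// through a SOCKS5 proxy such as a local Tor instance on localhost:9050.
function build_curl_options(string $url, string $userAgent, ?string $socksProxy = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_RETURNTRANSFER => true,  // return the response as a string
        CURLOPT_HEADER         => true,  // keep response headers in the output
        CURLOPT_ENCODING       => '',    // accept every encoding cURL supports
        CURLOPT_CONNECTTIMEOUT => 30,
        CURLOPT_TIMEOUT        => 30,
    ];

    if ($socksProxy !== null) {
        $options[CURLOPT_PROXY] = $socksProxy;
        // Assumption: resolve DNS through the proxy as well.
        $options[CURLOPT_PROXYTYPE] = CURLPROXY_SOCKS5_HOSTNAME;
    }

    return $options;
}
```

Usage would look like curl_setopt_array($ch, build_curl_options($url, $user_agent, 'localhost:9050')); omitting the third argument sends the request directly.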