Parsing Google search results. What else did I miss?
Hello. Please leave the moral side of the question out of the discussion.
Perhaps someone has already dealt with this...
The task is to parse Google search results.
The code:
/**
* Fetch the HTML for a request
* @param string $url request URL
* @return string HTML of the search results page
*/
private function getHtml($url)
{
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/45.0.2454.101 Chrome/45.0.2454.101 Safari/537.36');
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
$response = curl_exec($curl);
if(curl_getinfo($curl,CURLINFO_HTTP_CODE) !== 200)
{
# Fetch the captcha image and the cookies
$imgUrl = phpQuery::newDocument($response)->find("img")->attr("src");
$curlImage = curl_init();
curl_setopt($curlImage, CURLOPT_URL, "https://www.google.ru".$imgUrl);
curl_setopt($curlImage, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/45.0.2454.101 Chrome/45.0.2454.101 Safari/537.36');
curl_setopt($curlImage, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curlImage, CURLOPT_COOKIEJAR, __DIR__."/../../html/assets/cookies.txt");
file_put_contents("assets/captcha.jpg", curl_exec($curlImage));
curl_close($curlImage);
# Solve the captcha
$antiCaptcha = new AntiCaptcha;
$antiCaptcha->sendCaptcha();
$captcha = $antiCaptcha->getCaptchaValue();
# Build the request URL
$url = "https://ipv4.google.com/sorry/CaptchaRedirect?continue=".urlencode(phpQuery::newDocument($response)->find("[name=\"continue\"]")->attr("value"))
."&id=".urlencode(phpQuery::newDocument($response)->find("[name=\"id\"]")->attr("value"))
."&captcha=".$captcha
."&submit="."Submit";
# Request the URL with all the required data
$curlGoogleAntiCaptcha = curl_init();
curl_setopt($curlGoogleAntiCaptcha, CURLOPT_URL, $url);
curl_setopt($curlGoogleAntiCaptcha, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/45.0.2454.101 Chrome/45.0.2454.101 Safari/537.36');
curl_setopt($curlGoogleAntiCaptcha, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curlGoogleAntiCaptcha, CURLOPT_COOKIEFILE, __DIR__."/../../html/assets/cookies.txt");
$result = curl_exec($curlGoogleAntiCaptcha);
// For some reason I get the captcha page again here :(
return $result;
}
curl_close($curl);
return $response;
}
Why do this at all, when search results are available as an RSS feed, the old-fashioned way?
https://news.google.com/news?pz=1&cf=all&ned=ru_ru...
Good afternoon.
I solved a similar problem just recently. The difference I notice right away is that in the last request to Google (the captcha confirmation) I send an empty continue parameter. As I recall, I also had trouble getting past that step, with the same symptoms as yours.
I can share an abstract "debug" class that I made for tests and debugging: pastebin.com/Eymi1U1K
And yes, all cookie handling is delegated to curl, so don't be surprised that cookies never appear explicitly in the code.
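Delegating cookies to curl can be sketched roughly as follows. This is a minimal illustration, not the class from the pastebin link: the function name `newSessionHandle` and the cookie file path are assumptions made up for the example. The point is that every request in the session uses the same file for both `CURLOPT_COOKIEFILE` and `CURLOPT_COOKIEJAR`.

```php
<?php
/**
 * Create a cURL handle that reads and writes cookies in one shared file.
 * The function name and the cookie file location are illustrative assumptions.
 */
function newSessionHandle(string $url, string $cookieFile)
{
    $curl = curl_init();
    curl_setopt_array($curl, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects, e.g. to the captcha page
        CURLOPT_COOKIEFILE     => $cookieFile, // load cookies from this file before the request
        CURLOPT_COOKIEJAR      => $cookieFile, // save cookies to it when the handle is closed
    ]);
    return $curl;
}
```

With this approach, cookies never show up in the calling code. Keep in mind that curl only sends a stored cookie to hosts that match the cookie's domain.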
Thanks everyone, I found my mistake.
joxi.ru/BA00GP6u5GpgAy
That was the root of the problem: the cookies were saved for *.google.ru, but I was sending the captcha to *.google.com.
And, Dmitry, this is also possible in PHP.
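One way to avoid this kind of mismatch is to derive the host of the confirmation request from the URL of the captcha page itself instead of hard-coding it. The sketch below is only an illustration under that assumption: the helper name `buildCaptchaConfirmUrl` is hypothetical, and the `/sorry/CaptchaRedirect` path and parameter names are taken from the code in the question, not from any documented API.

```php
<?php
/**
 * Build the captcha confirmation URL on the same scheme and host as the
 * captcha page, so that cookies saved for that host are sent back to it.
 * Hypothetical helper; path and parameter names come from the question's code.
 */
function buildCaptchaConfirmUrl(
    string $captchaPageUrl,
    string $continue,
    string $id,
    string $captcha
): string {
    $parts = parse_url($captchaPageUrl);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        throw new InvalidArgumentException('Not an absolute URL: ' . $captchaPageUrl);
    }

    // http_build_query URL-encodes every value, including the captcha text.
    $query = http_build_query([
        'continue' => $continue,
        'id'       => $id,
        'captcha'  => $captcha,
        'submit'   => 'Submit',
    ]);

    return $parts['scheme'] . '://' . $parts['host'] . '/sorry/CaptchaRedirect?' . $query;
}
```

When redirects are followed, the captcha page URL can be read with `curl_getinfo($curl, CURLINFO_EFFECTIVE_URL)` after `curl_exec`. The captcha image should be requested from that same host too, so that the cookie file holds cookies for the domain that later receives the answer.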