PHP
Yuri Denisov, 2016-08-30 07:22:20

How to send a CURL request to a site with a captcha?

Good afternoon, everyone! I need help with the following problem. There is a site that requires a captcha to be entered before it returns data. The captcha changes every time the page is reloaded; the JS function that does this is called from the body's onload handler. I need to fetch the captcha, show it to the user, let the user fill it in, and then submit that data to the form. Fetching the captcha with cURL is no problem; I do it like this:

function request($url,$post = 0){
   $ch = curl_init();
   curl_setopt($ch, CURLOPT_URL, $url );
   curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36');
   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the server's response instead of printing it
   curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
   curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30); // connection timeout, in seconds
   curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // skips certificate verification (insecure)
   curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__).'/cookie.txt'); // save cookies to a file
   curl_setopt($ch, CURLOPT_COOKIEFILE,  dirname(__FILE__).'/cookie.txt'); // read cookies back from the same file
   curl_setopt($ch, CURLOPT_POST, $post!==0 ); // send as POST when data is given
   if($post)
       curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
   $data = curl_exec($ch);
   curl_close($ch); // cookies are written to the jar once the handle is released
   return $data;
}
$data = request('******');
include 'simple_html_dom.php';
$data = str_get_html($data);
foreach($data->find('img[id=captcha]') as $element) 
   {
      echo "<img src=\"*****/".$element->src."\" /><br>";
} 
$data->clear();
unset($data);

The asterisks stand for the site I need. The captcha is displayed, but then comes the next request, which I make like this:
function request($url,$auth){
   $ch = curl_init();
   curl_setopt($ch, CURLOPT_URL, $url ); // the URL to send to
   curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36');
   curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the server's response instead of printing it
   curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
   curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30); // connection timeout, in seconds
   curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
   curl_setopt($ch, CURLOPT_COOKIEJAR, dirname(__FILE__).'/cookie.txt'); // save cookies to a file
   curl_setopt($ch, CURLOPT_COOKIEFILE,  dirname(__FILE__).'/cookie.txt');
   curl_setopt($ch, CURLOPT_POST, true);
   curl_setopt($ch, CURLOPT_POSTFIELDS, $auth);
   $data = curl_exec($ch);
   curl_close($ch);
   return $data;
}
$auth = "series=$seria&number=$number&answer=$captcha";
$data = request('******',$auth);
include 'simple_html_dom.php';
$data = str_get_html($data);
foreach($data->find('div[id=response]') as $element) 
   {
      echo $element->plaintext."<br>";
   } 
   foreach($data->find('div[id=error]') as $element2) 
   {
      echo $element2->plaintext."<br>";
   }

As a result, a new request is made, and the site replies that the captcha is incorrect. Is it possible to somehow keep the connection open and send the second request without reloading the page?

2 answer(s)
PrAw, 2016-08-30
@denissov

HTTP is stateless by design: each request is independent. The only workaround that lets you track state is a cookie.
The whole sequence of work:
1. Download the page with the captcha.
2. Immediately DOWNLOAD THE CAPTCHA IMAGE and save it to a file. This is where your mistake is: you hand out the image URL, so the image is opened by a different browser (the user's), with different cookies and a different IP address.
3. Send the image to the user from your local server and get the captcha text back from them.
4. POST the captcha text to the form from step 1.
Keep an eye on the cookies at every stage. Wireshark/tcpdump will help here by letting you inspect the actual traffic.
The request() function from the first file is more than enough; why hack together a second function with the same name for the second case?
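
The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the asker's actual code: the `fetch()` and `captchaSrc()` helpers, the `COOKIE_JAR` constant, and the `example.com` URLs in the usage comments are hypothetical, and PHP's built-in DOM extension is swapped in for simple_html_dom so the block is self-contained. The field names `series`, `number`, and `answer` come from the question.

```php
<?php
// Hypothetical path: one cookie jar shared by every request in the flow.
const COOKIE_JAR = __DIR__ . '/cookie.txt';

/** GET a URL, or POST to it when $post is given, always with the shared jar. */
function fetch(string $url, ?array $post = null): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 30,
        CURLOPT_COOKIEJAR      => COOKIE_JAR, // cookies are saved here
        CURLOPT_COOKIEFILE     => COOKIE_JAR, // and loaded from here
    ]);
    if ($post !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post));
    }
    $body  = curl_exec($ch);
    $error = curl_error($ch);
    curl_close($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . $error);
    }
    return $body;
}

/** Return the src attribute of <img id="captcha">, or null if it is absent. */
function captchaSrc(string $html): ?string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate invalid markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    $img = (new DOMXPath($doc))->query('//img[@id="captcha"]')->item(0);
    return $img instanceof DOMElement ? $img->getAttribute('src') : null;
}

// Steps 1-2 (first script run, hypothetical URLs): download the page, then
// download the image through the same jar and keep a local copy.
// $src = captchaSrc(fetch('https://example.com/form'));
// file_put_contents(__DIR__ . '/captcha.png', fetch('https://example.com/' . $src));
//
// Step 3: show the local captcha.png to the user and collect the text.
//
// Step 4 (second script run): the jar still holds the session cookie.
// echo fetch('https://example.com/form',
//     ['series' => $seria, 'number' => $number, 'answer' => $captcha]);
```

Because the cookie jar is a file, it survives between the script run that shows the captcha and the one that submits the form. With several simultaneous users, each would presumably need a separate jar file.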

Alexander Aksentiev, 2016-08-30
@Sanasol

Call curl_init once and send all requests through that one handle; then no data will be lost between requests.
This is a hack, though.
Or try this:
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
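
A minimal sketch of the first suggestion, reusing a single handle, might look like the class below. The `CurlSession` name and its methods are hypothetical. Setting `CURLOPT_COOKIEFILE` to an empty string turns on libcurl's in-memory cookie handling, so cookies set by one response are sent with the next request on the same handle. A handle cannot outlive the PHP process, so this only helps when both requests happen in the same script run (for example, a CLI script that prompts for the captcha text).

```php
<?php
/** Hypothetical wrapper that keeps one cURL handle, and its cookies, alive. */
final class CurlSession
{
    private $ch;

    public function __construct()
    {
        $this->ch = curl_init();
        curl_setopt_array($this->ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_CONNECTTIMEOUT => 30,
            CURLOPT_COOKIEFILE     => '', // empty string: in-memory cookies
        ]);
    }

    public function get(string $url): string
    {
        curl_setopt($this->ch, CURLOPT_HTTPGET, true); // reset a previous POST
        return $this->exec($url);
    }

    public function post(string $url, array $fields): string
    {
        curl_setopt($this->ch, CURLOPT_POST, true);
        curl_setopt($this->ch, CURLOPT_POSTFIELDS, http_build_query($fields));
        return $this->exec($url);
    }

    private function exec(string $url): string
    {
        curl_setopt($this->ch, CURLOPT_URL, $url);
        $body = curl_exec($this->ch);
        if ($body === false) {
            throw new RuntimeException(curl_error($this->ch));
        }
        return $body;
    }
}

// Usage (hypothetical URL): both calls share the same cookies.
// $session = new CurlSession();
// $html    = $session->get('https://example.com/form');
// $result  = $session->post('https://example.com/form', ['answer' => $captcha]);
```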
