PHP
Aljo, 2021-11-09 20:30:43

Why is a page requested via GET parsed incorrectly?

Hello!
A third-party site has a page with a search form that uses the GET method.
If you enter the page address with the necessary GET parameters directly in the browser, the results page opens immediately.

But when I try to parse the exact same URL with simple_html_dom, I get a page that shows a "nothing found" message instead of the results.

What could be the reason? The URL is 100% correct: I copied it from the script, pasted it into the browser address bar, and everything displays correctly.


1 answer(s)
Nadim Zakirov, 2021-11-09
@aljo222

When loading the HTML, you need to imitate a browser: at a minimum pass the User-Agent header, and at most all the headers a browser usually sends.
An example of loading a page while imitating a browser:

<?php

// Set the document type and encoding:
header('Content-Type: text/html; charset=utf-8');

// Enable error display:

ini_set('error_reporting', E_ALL);
ini_set('display_errors', 1);
ini_set('display_startup_errors', 1);

// URL to parse:
$url = 'https://w8shipping.com/tracking/?vin=3GNAXJEV5JS538785&searchAuto=Search';

// Create a new cURL session:
$curl = curl_init();

// Set the address of the target page:
curl_setopt($curl, CURLOPT_URL, $url);

// Disable SSL certificate verification (insecure; suitable for debugging only):
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);

// Set headers to imitate a browser.
// Accept-Encoding is not listed here: a hand-written Accept-Encoding header
// lets the server reply with a compressed body that cURL will not decode.
// CURLOPT_ENCODING below sends the header and decodes the response instead.

$headers = [
  'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
  'Accept-Language: ru-RU,ru;q=0.9',
  'Connection: keep-alive',
  'DNT: 1',
  'Host: ' . parse_url($url)['host'],
  'sec-ch-ua: "Chromium";v="94", ";Not A Brand";v="99"',
  'sec-ch-ua-mobile: ?0',
  'sec-ch-ua-platform: "Windows"',
  'Sec-Fetch-Dest: document',
  'Sec-Fetch-Mode: navigate',
  'Sec-Fetch-Site: none',
  'Sec-Fetch-User: ?1',
  'Upgrade-Insecure-Requests: 1',
  'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.114 Safari/537.36'
];

curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);

// Accept every encoding this cURL build supports and decode the body automatically:
curl_setopt($curl, CURLOPT_ENCODING, '');

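// Allow redirects to be followed:
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);

// Return the response as a string instead of printing it directly:
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

// Perform the request:
$result = curl_exec($curl);

// curl_exec() returns false on failure, so show the cURL error in that case:
if ($result === false) {
    $result = 'cURL error: ' . curl_error($curl);
}

// Close the session:
curl_close($curl);

// Show the result:
echo $result;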
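
Once the HTML is in $result, it should be parsed from that string rather than by letting the parser download the URL again, since a second download would go out without the browser headers. With simple_html_dom this would presumably mean calling str_get_html($result) instead of file_get_html($url). The sketch below is not from the original answer: it swaps in PHP's built-in DOMDocument and DOMXPath so that it runs without extra libraries, and the helper name extractTexts, the sample markup, and the XPath queries are all hypothetical.

```php
<?php

// Hypothetical helper: returns the trimmed text of every node matching an XPath query.
function extractTexts(string $html, string $query): array
{
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // The XML declaration prefix makes libxml treat the markup as UTF-8.
    $dom->loadHTML('<?xml encoding="utf-8" ?>' . $html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query($query);
    if ($nodes === false) {
        return []; // the XPath expression was malformed
    }

    $texts = [];
    foreach ($nodes as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Stand-in for the string returned by curl_exec(); the real page markup will differ.
$sample = '<html><body><h1>Tracking</h1><p class="status">Found</p></body></html>';

$titles = extractTexts($sample, '//h1');
print_r($titles);
```

In the script above, the call would take $result in place of $sample, with a query matching the actual markup of the results page.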
