Parsing
artmart999, 2021-01-13 11:25:23

How do I parse an Instagram page with Guzzle and DiDom?

Good afternoon, everyone. I want to read the value shown in this screenshot from a user's Instagram page: 5ffeadc789ecd835613587.png

At this spot in the page source, Instagram provides JSON data about the user's page. But there is a problem.
When you open an Instagram profile page, the first thing shown for a second or two is the following screen:
5ffeae4605368880396780.png
The source code of that screen does not contain the value I need.
Only after that does the profile page itself open.
How can I make Guzzle wait a couple of seconds after opening the link and then parse the profile page that loads? Thanks in advance.

Here is my code:

<?php

require "../vendor/autoload.php";

use GuzzleHttp\Client;
use DiDom\Document; // imported but not used below

$client = new Client();
$domain = "https://www.instagram.com/instagram/";

$response = $client->get($domain);
$html = (string) $response->getBody();

// Strip ASCII control characters (0-31 and 127), including line breaks.
$html = preg_replace('/[\x00-\x1F\x7F]/', '', $html);

// Non-greedy match, so the capture stops at the first "};</script>".
$pattern = '/<script type="text\/javascript">window\._sharedData = \{(.*?)\};<\/script>/';

// Check that the script tag was found before indexing into $matches.
if (preg_match($pattern, $html, $matches) !== 1) {
    exit('window._sharedData was not found in the response');
}

$array = json_decode('{' . $matches[1] . '}', true);

echo '<pre>';
print_r($array);
echo '</pre>';
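
One thing worth noting: Guzzle is a plain HTTP client and does not execute JavaScript, so a delay would not change the HTML it receives. If the value only appears after the page's scripts run, it will not be in the response body no matter how long the request waits. The sketch below only isolates the extraction step so it can be checked against any HTML string. The helper name `extractSharedData` and the sample markup are hypothetical, and it assumes the response still contains a `window._sharedData` script tag, which may not be the case.

```php
<?php

/**
 * Hypothetical helper: pull the window._sharedData JSON out of an HTML string.
 * Returns the decoded array, or null when the script tag is absent
 * or its contents are not valid JSON.
 */
function extractSharedData(string $html): ?array
{
    // Non-greedy match up to the first "};</script>" so later scripts are not swallowed.
    // The "s" flag lets "." span line breaks, so control characters need no stripping first.
    $pattern = '/<script type="text\/javascript">window\._sharedData = (\{.*?\});<\/script>/s';

    if (preg_match($pattern, $html, $matches) !== 1) {
        return null; // e.g. a loading or login page was served instead
    }

    $data = json_decode($matches[1], true);

    return is_array($data) ? $data : null;
}

// Usage with an inline sample (assumed markup) instead of a live request.
$sample = '<html><script type="text/javascript">window._sharedData = {"config":{"viewer":null}};</script></html>';

var_dump(extractSharedData($sample) !== null);       // bool(true)
var_dump(extractSharedData('<html>loading</html>')); // NULL
```

Returning null instead of indexing straight into the match array makes it easy to tell "the page did not contain the data" apart from "the data was there but empty".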
