PHP
khodos_dmitry, 2021-02-02 18:30:29

How to download an image from a page with php-webdriver?

Let's say I have found the src of an image on a page. How can I save that image locally? I want to do it through the webdriver itself, not by taking the full URL and downloading it with a separate HTTP request.
The problem: I need to download pictures from a site that is protected by Cloudflare. curl fails the JavaScript check. Opening and downloading each image separately through Selenium does not work either; apparently there is a check on the Referer header or something similar. So the page has to be opened and the pictures saved from there.
I tried to implement it like this (open each picture in a new tab and download it from there):

$crawler = $client->request('GET', $url);

$client->waitFor('html');

$imgs = $client->findElements(WebDriverBy::cssSelector('img'));

foreach ($imgs as $img) {
    $img_url = $img->getAttribute('src');
    $client->executeScript("window.open('$img_url');");
}

$windows = $client->getWindowHandles();

$visitedWindows[] = $client->getWindowHandle();

foreach ($windows as $window) {
    if (in_array($window, $visitedWindows, true)) {
        continue;
    } else {
        $client->getWebDriver()->switchTo()->window($window);
    }
}

But I get this error: stale element reference: element is not attached to the page document (Session info: headless chrome=88.0.4324.104)
This happens even though the script above finds all the image URLs correctly.


2 answer(s)
0ffff0, 2021-02-02
@0ffff0

curl -Ls -o file.jpg

Evgeny Palych, 2021-02-03
@xzKakoyLogin

Using the executeScript method, create a link on the page whose href holds an object URL for the image. Then click that link, and the picture is saved to the browser's downloads folder.
The JS to fetch the image will look something like this:

async function fetchImage(url){
  const data = await fetch(url, {credentials: 'include'});
  const contentType = data.headers.get('content-type');
  const buffer = await data.arrayBuffer();
  const blob = new Blob([buffer], { type: contentType});
  return blob;
}
async function createLinkImage(url){
  var blob = await fetchImage(url);
  var link = document.createElement('a');
  link.id = 'saveImageLink';
  link.href = URL.createObjectURL(blob);
  link.download = "image.jpg"; // preferably take the extension from the image's contentType
  link.innerHTML = 'download image';
  document.body.append(link);
}
createLinkImage('https://site.site/picture.jpg');
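The comment in the snippet above suggests taking the file extension from the image's content type. A minimal sketch of one way to do that, assuming a small hand-written table of common image MIME types; the names `IMAGE_EXTENSIONS` and `extensionFromContentType` are hypothetical and do not come from any library:

```javascript
// Hypothetical lookup table: covers only a few common image types.
const IMAGE_EXTENSIONS = new Map([
  ['image/jpeg', 'jpg'],
  ['image/png', 'png'],
  ['image/gif', 'gif'],
  ['image/webp', 'webp'],
  ['image/svg+xml', 'svg'],
]);

// Maps a Content-Type value to a file extension.
// Returns `fallback` when the type is missing or not in the table.
function extensionFromContentType(contentType, fallback = 'bin') {
  if (!contentType) return fallback;
  // Drop parameters such as "; charset=utf-8" and normalise the case.
  const mime = contentType.split(';')[0].trim().toLowerCase();
  return IMAGE_EXTENSIONS.get(mime) || fallback;
}
```

Inside createLinkImage this could be used as `link.download = 'image.' + extensionFromContentType(blob.type);`, since the blob was built with the response's content type.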

To download several images from one page, the script needs to be adjusted so that it does not add more than one element with the same id to the page.
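One possible adjustment, as a rough sketch only: give every link an id that includes the image's index. The names `saveLinkId` and `createLinkImages` and the `image-N.jpg` file names are assumptions made for illustration; the function relies on the browser globals `fetch`, `URL` and `document`, so it is meant to run in the page rather than on its own.

```javascript
// Hypothetical helper: builds a distinct element id for the image at `index`.
function saveLinkId(index) {
  return 'saveImageLink-' + index;
}

// Creates one download link per URL and returns the ids of the created links.
async function createLinkImages(urls) {
  const ids = [];
  for (let i = 0; i < urls.length; i++) {
    // Same-origin cookies are sent, as in the single-image version.
    const response = await fetch(urls[i], { credentials: 'include' });
    const blob = await response.blob();
    const link = document.createElement('a');
    link.id = saveLinkId(i);
    link.href = URL.createObjectURL(blob);
    link.download = 'image-' + i + '.jpg'; // placeholder extension
    link.textContent = 'download image ' + i;
    document.body.append(link);
    ids.push(link.id);
  }
  return ids;
}
```

Because the function is asynchronous, the PHP side would probably have to wait until each link exists before clicking it, for example by locating it with `WebDriverBy::id(...)` after an explicit wait.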
