How to parse images from Yandex Images?


Parsing

tank007, 2020-06-30 22:35:13

Good afternoon.
I put together the following code, but I can't even download the pictures.

import requests
from bs4 import BeautifulSoup

def SaveImageYandex(text):
    URL = 'https://yandex.ru/images/search?text=' + text
    headers = { 'accept':'*/*', 'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36',}
    response = requests.get(URL, headers=headers)
    soup = BeautifulSoup(response.content, 'html.parser')
    items = soup.find_all('div', class_='serp-item__preview')
    for item in items:
        # find() returns None when the link is missing, so check before calling get()
        link = item.find('a', class_='serp-item__link')
        href = link.get('href') if link else None
        if href:
            foto_url = 'https://yandex.ru' + href
            response_sec = requests.get(foto_url, headers=headers)
            soup_sec = BeautifulSoup(response_sec.content, 'html.parser')
            items_sec = soup_sec.find_all('div', class_='MMImageContainer')
            for item_sec in items_sec:
                # search inside item_sec (the first version searched item by mistake)
                img = item_sec.find('img', class_='MMImage-Preview')
                href_sec = img.get('src') if img else None

After a search, a set of previews is displayed; clicking the desired preview opens another page, which, as I understand it, is where the link to the picture has to be found.
I get the search page.
I get the address for the next step into the "foto_url" variable.
Then I try to load the page for the selected preview, but that does not work either.
"items_sec" comes back as an empty list, even though that div is where the links to the image are located.
I don't understand how to proceed.
Could you please tell me where to go from here?
Or maybe someone has working code for parsing Yandex Images? Even just up to the point of getting a link to the picture.
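
One thing that may be worth checking before requesting the second page. This is only a minimal sketch, assuming the preview link's href carries the address of the full-size picture in a query parameter. The parameter name img_url and the sample href are assumptions for illustration and are not confirmed anywhere in this thread. If the assumption holds, the link could be read straight from the href, without loading the preview page at all.

```python
from urllib.parse import urlparse, parse_qs

def extract_img_url(href, param="img_url"):
    """Return the decoded value of `param` from the href's query string, or None.

    `img_url` is an assumed parameter name, not something confirmed for Yandex.
    """
    query = urlparse(href).query
    values = parse_qs(query).get(param)
    return values[0] if values else None

# Hypothetical href, shaped like the value stored in `href` in the code above
sample_href = "/images/search?pos=0&img_url=https%3A%2F%2Fexample.com%2Fcat.jpg&text=cat"
print(extract_img_url(sample_href))
```

If the function returns None for real hrefs, the assumption does not hold for the current markup, and the page source has to be inspected by hand.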


2 answer(s)

gkots, 2021-07-28
@gkots

I have a similar problem. How did you end up solving it? Please tell me what you ended up doing.

Hiroshimakado, 2022-02-11
@Hiroshimakado

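import os
import random
import requests
from bs4 import BeautifulSoup

# ua was not defined in the original snippet; this value is the user-agent string from the question
ua = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'

def get_link_img(url):
    # the HTTP header name is User-Agent, not user_agent
    response = requests.get(url, headers={'User-Agent': ua})
    soup = BeautifulSoup(response.content, "html.parser")
    links = soup.find_all("img", class_="serp-item__thumb justifier__thumb")
    os.makedirs("Galery", exist_ok=True)
    for link in links:
        src = link.get("src")
        if not src:
            continue  # skip thumbnails without a src attribute
        linked = "https:" + src
        # writing to a file
        name = random.random()
        p = requests.get(linked)
        with open(os.path.join("Galery", f"{name}.jpg"), "wb") as out:
            out.write(p.content)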
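
The two fragile spots in the function above are prefixing "https:" to every src and naming files with random.random(). A minimal sketch of one way to handle both, assuming the thumbnails use protocol-relative addresses such as //host/path. The helper names absolute_src and file_name_for, the BASE_URL constant, and the sample address in the usage lines are hypothetical and not part of the answer's code.

```python
import hashlib
from urllib.parse import urljoin

# Assumed base address of the search page; used only to resolve relative src values
BASE_URL = "https://yandex.ru/images/search"

def absolute_src(src, base=BASE_URL):
    """Turn a protocol-relative, relative, or absolute src into an absolute URL."""
    if not src:
        return None
    return urljoin(base, src)

def file_name_for(url, ext=".jpg"):
    """Build a stable file name from the URL, so the same picture maps to the same file."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:16]
    return digest + ext

# Usage with a made-up thumbnail address
thumb = absolute_src("//example.com/i?id=abc")
print(thumb, file_name_for(thumb))
```

Hashing the URL instead of drawing a random number means a repeated run overwrites the same file rather than saving a duplicate.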