Why doesn't my parser work?
Sorry for the basic question. I'm trying to write a simple BeautifulSoup parser that extracts links from Yandex search results, and I've run into a problem. When I grab all the links on the page with the following code, I do get results (at least until Yandex starts serving a captcha):
from bs4 import BeautifulSoup
import requests
from fake_useragent import UserAgent

meme_page = 'https://www.yandex.ru/search/?text=%D0%9F%D0%BB%D1%8F%D0%B6%20%D0%B4%D0%BB%D1%8F%20%D0%BD%D1%83%D0%B4%D0%B8%D1%81%D1%82%D0%BE%D0%B2%20%D0%B2%20%D0%BC%D0%BE%D1%81%D0%BA%D0%B2%D0%B5&lr=213/'
# Send a browser-like User-Agent so Yandex doesn't reject the request outright
response = requests.get(meme_page, headers={'User-Agent': UserAgent().chrome})
soup = BeautifulSoup(response.content, 'html.parser')
for link in soup.find_all('a', href=True):
    print(link['href'])
But when I run the same code and filter by the class of the result links, the loop prints nothing:

for link in soup.find_all('a', {'class': 'link link_theme_outer path__item i-bem link_js_inited'}, href=True):
    print(link['href'])
Because those result links are generated by JavaScript after the page loads, so they simply aren't present in the raw HTML that requests downloads. Use Selenium, which drives a real browser and sees the rendered DOM.
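You can confirm this yourself by checking whether the class you're filtering on occurs anywhere in the downloaded HTML (a quick sanity check, reusing the response object from the question):

# If this prints False, the markup you see in the browser's DevTools
# was built by JavaScript and never existed in the HTML requests received.
print('link_theme_outer' in response.text)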
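Here is a minimal sketch of the Selenium route, assuming Selenium 4 (which fetches chromedriver on its own) and Chrome installed; the wait timeout and the broad a[href] selector are illustrative, not the only way to do it:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

meme_page = 'https://www.yandex.ru/search/?text=%D0%9F%D0%BB%D1%8F%D0%B6%20%D0%B4%D0%BB%D1%8F%20%D0%BD%D1%83%D0%B4%D0%B8%D1%81%D1%82%D0%BE%D0%B2%20%D0%B2%20%D0%BC%D0%BE%D1%81%D0%BA%D0%B2%D0%B5&lr=213/'

driver = webdriver.Chrome()
try:
    driver.get(meme_page)
    # Wait until at least one link shows up in the rendered DOM
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, 'a[href]'))
    )
    for a in driver.find_elements(By.CSS_SELECTOR, 'a[href]'):
        print(a.get_attribute('href'))
finally:
    driver.quit()

Keep in mind that Selenium doesn't make the captcha problem go away: Yandex will still throttle automated searches.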