Why can't I parse page elements with obscure classes?
I want to learn parsing. I have already written one parser, and it even worked. After a while I decided to come back to it, so I started writing a new parser and am relearning almost everything from scratch. For example, I wanted to parse information about matches on the betting site parimatch. But when I try to select the elements that hold this information, the parser either cannot find them or returns an empty result. Why?
P.S. I have read a lot on forums and also tried Selenium, with the same result.
import requests
from bs4 import BeautifulSoup

URL = 'https://www.parimatch.ru/'
HEADERS = {
    'accept': 'image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36 Edg/89.0.774.63'
}


# Fetch the HTML
def get_html(url, params=None):
    # Pass HEADERS and params, which the original defined but never used
    html = requests.get(url, headers=HEADERS, params=params).text
    return html  # Return the fetched page


# Find the required content
def get_content(html):
    soup = BeautifulSoup(html, "html.parser")
    items = soup.find_all('div', {'class': 'QHMOkrbtqvSkGzF6oZD2a'})
    print(items)


if __name__ == '__main__':
    html = get_html(URL)
    get_content(html)
[]
Process finished with exit code 0
Parsing is not always as simple as it seems. Sites often try to protect themselves from simple parsers, and the protection ranges from minimal measures to more elaborate ones.
If you want to see what the GET request actually returns, write the response to a file. That makes it easier to understand where the error is. For example:
import requests

getpost = requests.get('https://www.parimatch.ru/')
with open('log.html', 'w', encoding='utf-8') as f:
    f.write(getpost.text)
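Once the response is saved, one possible next step is to check whether the class from the question is in the file at all. The sketch below is only an illustration: the helper names `contains_class` and `inspect_dump` are hypothetical, and it assumes the dump was written to `log.html` as in the snippet above. It uses a plain substring check, so it only tells you whether the class name occurs anywhere in the raw text.

```python
from pathlib import Path

# Class name taken from the question; log.html is the dump written above.
TARGET_CLASS = "QHMOkrbtqvSkGzF6oZD2a"
DUMP_PATH = "log.html"


def contains_class(html: str, class_name: str) -> bool:
    """Return True if class_name occurs anywhere in the raw HTML text."""
    return class_name in html


def inspect_dump(path: str, class_name: str) -> None:
    """Hypothetical helper: report whether the saved response mentions the class."""
    html = Path(path).read_text(encoding="utf-8")
    print(f"{len(html)} characters saved in {path}")
    if contains_class(html, class_name):
        print("The class name is present in the server response.")
    else:
        print("The class name is absent from the server response, "
              "so find_all has nothing to match.")


if __name__ == "__main__":
    # Only inspect the dump if it has already been written.
    if Path(DUMP_PATH).exists():
        inspect_dump(DUMP_PATH, TARGET_CLASS)
```

If the class name is absent, the HTML that `requests` received differs from what the browser's inspector shows, which would explain the empty list printed by `find_all`.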