What is wrong with my Python parser?
I tried to parse the play details (timeline, ratings) shown on the site, but the script does not return the elements I am trying to get.
What am I doing wrong?
Here is the code:
# parse
import requests
from bs4 import BeautifulSoup

url = 'https://osu.ppy.sh/users/16873295'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36 OPR/68.0.3618.206', 'accept': '*/*'}


def get_html(url, params=None):
    r = requests.get(url, headers=headers, params=params)
    return r


def get_content(html):
    soup = BeautifulSoup(html, 'html.parser')
    items = soup.find_all('div', class_='play-detail')
    print(items)


def parse():
    html = get_html(url)
    if html.status_code == 200:
        get_content(html.text)
    else:
        print('Error')


parse()
As I have advised here many times: make it a rule, before parsing anything, to save the page to disk with a script. Then open the saved file in a text editor and look in the HTML for the element with the class (or id) you need. If it is there, you can work with requests. If it is not, the content is most likely added by JavaScript after the page loads, so use Selenium (there is also the option of finding the XHR request that loads the data...).
Here is the code itself:
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
}
url = 'link'  # placeholder: put the URL of the page here
filename = 'index.html'

response = requests.get(url, headers=headers)
if response.status_code == 200:
    # Save the HTML exactly as the server returned it (before any JavaScript runs)
    with open(filename, 'w', encoding='utf-8') as file:
        file.write(response.text)
else:
    print(response)
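Instead of searching the saved file by eye, the same check can be scripted. Below is a minimal sketch that uses only the standard library; the names ClassCounter, count_elements and check_saved_page are made up for this illustration, and the file name index.html and the class play-detail are taken from the code above.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts start tags with a given name that carry a given CSS class."""

    def __init__(self, tag, class_name):
        super().__init__()
        self.tag = tag
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        # attrs is a list of (name, value) pairs; the value may be None
        classes = (dict(attrs).get('class') or '').split()
        if self.class_name in classes:
            self.count += 1


def count_elements(html_text, tag, class_name):
    """Return how many <tag> elements in html_text have class_name."""
    parser = ClassCounter(tag, class_name)
    parser.feed(html_text)
    parser.close()
    return parser.count


def check_saved_page(filename='index.html', tag='div', class_name='play-detail'):
    """Count matching elements in the page previously saved to disk."""
    with open(filename, encoding='utf-8') as file:
        return count_elements(file.read(), tag, class_name)
```

Calling check_saved_page() after saving the page tells you which way to go: a result of 0 means the element is not in the HTML the server sends, so requests plus BeautifulSoup will not find it either, and you need Selenium or the underlying XHR request.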