How to parse all pages of the Steam Community Market?
When I try to parse the subsequent pages of the Community Market, I get the same results every time.
For example:
https://steamcommunity.com/market/search?q=&catego...
This is the first page; I parsed the item names and prices from it.
Now I take the next page:
https://steamcommunity.com/market/search?q=&catego...
and when I try to parse it, I get the same results as from the first page.
I also tried passing the page through params, but that didn't work either.
The code itself:
import math

import requests
from bs4 import BeautifulSoup

# link to page 1 of the Community Market
URL = 'https://steamcommunity.com/market/search?q=&category_570_Hero%5B%5D=any&category_570_Slot%5B%5D=any&category_570_Type%5B%5D=any&category_570_Quality%5B%5D=tag_strange&appid=570#p1_popular_desc'
HEADERS = {'User-Agent': 'Mozilla/5.0', 'accept': '*/*'}


def get_html(url, params=None):
    # fetch the page and return the response object
    r = requests.get(url, headers=HEADERS, params=params)
    return r


def get_pages_count(html):
    # total number of results divided by 10 items per page, rounded up
    soup = BeautifulSoup(html, 'html.parser')
    pagination = soup.find('span', id='searchResults_total').get_text()
    pagination = math.ceil(int(pagination.replace(',', '')) / 10)
    return pagination


def get_content(html):
    # collect the name, link and price of every listing on the page
    soup = BeautifulSoup(html, 'html.parser')
    items = soup.find_all('a', class_='market_listing_row_link')
    sk = []
    for item in items:
        sk.append({
            'name': item.find('span', class_='market_listing_item_name').get_text(),
            'link': item.attrs['href'],
            'price': item.find('span', class_='market_table_value normal_price').find('span', class_='normal_price').get_text()
        })
    return sk


def parse():
    html = get_html(URL)
    tempURL = URL
    if html.status_code == 200:
        sk = []
        pages_count = get_pages_count(html.text)
        for page in range(1, pages_count + 1):
            # replace the "#p..." suffix with the one for the current page
            tempURL = tempURL[0:tempURL.find("#p")]
            tempURL += f'#p{page}_popular_desc'
            html = get_html(tempURL)
            sk.extend(get_content(html.text))
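
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the page URLs built in the loop differ only in the part after `#` (the fragment), and HTTP clients do not send the fragment to the server. The snippet below uses only the standard library to show that, once the fragment is removed, the URLs for page 1 and page 2 are the same string. The helper name `request_target` is hypothetical and exists only for this illustration.

```python
from urllib.parse import urldefrag

# Base search URL from the question, without the "#p..." fragment.
BASE = ('https://steamcommunity.com/market/search?q='
        '&category_570_Hero%5B%5D=any&category_570_Slot%5B%5D=any'
        '&category_570_Type%5B%5D=any&category_570_Quality%5B%5D=tag_strange'
        '&appid=570')


def request_target(url):
    """Hypothetical helper: return the URL with its fragment removed,
    i.e. the part that actually identifies the resource on the server."""
    return urldefrag(url).url


page1 = request_target(BASE + '#p1_popular_desc')
page2 = request_target(BASE + '#p2_popular_desc')

# Both pages collapse to the same address once the fragment is dropped.
print(page1 == page2)  # True
```

If this is indeed the cause, the page number has to reach the server some other way than the fragment, for example through whatever request the browser itself makes when you switch pages (visible in the Network tab of the browser's developer tools). Which request that is cannot be determined from the code in the question.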