S
S
s1veme2020-07-31 11:20:58
Python
s1veme, 2020-07-31 11:20:58

How to parse steam prices/link?

hello world!

I started parsing steam, a problem arose, as well as a huge misunderstanding.

The code:

from bs4 import BeautifulSoup
import requests

headers = {

    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Accept-Language': 'ru-UA,ru;q=0.9,en-US;q=0.8,en;q=0.7,ru-RU;q=0.6',
}
steam_link = ('https://steamcommunity.com/market/listings/730/AK-47%20%7C%20Phantom%20Disruptor%20%28Field-Tested%29')
print(steam_link)
full_page = requests.get(steam_link, headers=headers)
soup = BeautifulSoup(full_page.content, 'html.parser')

skins = soup.find_all('div', class_='market_listing_row')
print(skins)

for skin in skins:
  name = skin.find('span',class_='market_listing_row ').text
  counts = skin.find('span',class_='market_listing_num_listings_qty').text
  price = skin.find('span',class_='sale_price').text.replace('От','').strip() #HACK
  print(name, counts, price)


Link to what I'm parsing:
https://steamcommunity.com/market/listings/730/AK-...

How to parse price + title + 'Buy' link?
Thank you very much for your answer :)

(I decided to master parsing a little, but something got stuck)

Answer the question

In order to leave comments, you need to log in

1 answer(s)
D
Dmitry, 2020-07-31
@aleksegolubev

This block is updated using js, so you need to look at the requests that the browser makes.
Open the developer tools in your browser and look at the Network tab for XHR requests. There will be requests to https://steamcommunity.com/market/itemordershistogram , which return a json with the data you need.

Request URL: https://steamcommunity.com/market/itemordershistogram?country=RU&language=russian&currency=1&item_nameid=176118358&two_factor=0

country, language, currency - unchanged
item_nameid is in the page source, can be obtained using a regular expression:
import re
result = re.findall(r'Market_LoadOrderSpread\(\s*(\d+)\s*\)', str(full_page.content))
print (result[0])

After that, just make a request with requests and parse the json

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question