Python
SomeSpecialOne, 2020-12-15 23:04:31

How can Steam identify the parser?

I have a bot that parses the Steam site. Naturally, it periodically gets banned with status code 429 (Too Many Requests). Steam does not send a Retry-After header, so I have to guess the wait time blindly. I connected Tor with Privoxy and run the parser through them. If I change the IP for every request, Steam does not identify the parser, but that is too slow. If I change the IP only right after a ban, Steam still identifies the parser under the new IP and bans it again. How is that possible?

import random
import time

import requests
from fake_useragent import UserAgent  # presumably the source of UserAgent

# CM is my own Tor controller helper (not shown here).

session = requests.session()
# Privoxy is a plain HTTP proxy, so both entries use the http:// scheme.
proxies = {"http": "http://127.0.0.1:8118",
           "https": "http://127.0.0.1:8118"}
cookies = {'Steam_Language': "russian"}
headers = {'user-agent': UserAgent().random}


def get_html(url):
    global session, headers
    while True:
        html = session.get(url, cookies=cookies, headers=headers, proxies=proxies)
        time.sleep(random.choice([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]))
        if html.status_code == 200:
            return html
        elif html.status_code == 429:
            CM().new_identity()  # change the IP
            session = requests.session()  # change the session/connection
            headers = {'user-agent': UserAgent().random}  # change the user agent
            time.sleep(random.randint(6, 14))
            # fall through to the outer loop so the request is actually retried
        else:
            return None  # 400 or any other status: give up
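
Since Steam sends no Retry-After header, one alternative to a fixed random pause is exponential backoff with jitter, where the wait grows with each consecutive 429. The following is only a minimal sketch: the name backoff_delay and the base and cap values are assumptions chosen for illustration, not anything Steam documents.

```python
import random


def backoff_delay(attempt, base=5.0, cap=300.0, rng=random):
    """Return a randomized delay in seconds for a 0-based retry attempt.

    base and cap are illustrative guesses, not values published by Steam.
    """
    # The upper bound doubles with every attempt but never exceeds cap.
    upper = min(cap, base * (2 ** attempt))
    # "Full jitter": pick uniformly in [0, upper] so retries do not line up.
    return rng.uniform(0, upper)
```

In get_html this could replace time.sleep(random.randint(6, 14)) with time.sleep(backoff_delay(attempt)), where attempt is a counter incremented on every 429 and reset to 0 after a 200 response.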
