Parsing
whoareyoutofuckinglecture, 2020-06-01 22:08:07

How do sites detect that they are being visited by a Selenium bot rather than a real user?

I am writing a Python 3 / Selenium WebDriver bot to scrape a foreign online store.

Somehow the site immediately detects that Selenium is driving the browser and unceremoniously kicks my bot out :(
There are no captchas. The site just knows that I am not a real user.

Please tell me how this is possible and how to get around it.


1 answer(s)
Alexey Sundukov, 2020-06-01
@whoareyoutofuckinglecture

There are many ways to do it. Among the simple ones:

  • by User-Agent
  • by IP address, tracking the number of requests coming from one address
  • by the public proxies used (many such services explicitly identify themselves)
  • etc.
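To make the simple checks concrete, here is a rough sketch of what a site might run on its side. It is only an illustration, not how any particular store works: the function name `looks_like_bot`, the marker list, and the thresholds are all assumptions. The one real detail is that headless Chrome puts "HeadlessChrome" in its default User-Agent.

```python
import time
from collections import defaultdict, deque

# Assumed marker list. "HeadlessChrome" appears in headless Chrome's default User-Agent;
# the other entries are illustrative.
SUSPICIOUS_UA_MARKERS = ("headlesschrome", "python-requests", "phantomjs")
MAX_REQUESTS = 5        # assumed limit of requests per window
WINDOW_SECONDS = 10.0   # assumed sliding-window length

_recent_hits = defaultdict(deque)  # ip -> timestamps of recent requests


def looks_like_bot(ip, user_agent, now=None):
    """Return True if the request fails the User-Agent check or the per-IP rate check."""
    now = time.monotonic() if now is None else now

    # Check 1: empty or obviously automated User-Agent.
    ua = (user_agent or "").lower()
    if not ua or any(marker in ua for marker in SUSPICIOUS_UA_MARKERS):
        return True

    # Check 2: too many requests from one address inside the sliding window.
    hits = _recent_hits[ip]
    hits.append(now)
    while hits and now - hits[0] > WINDOW_SECONDS:
        hits.popleft()
    return len(hits) > MAX_REQUESTS
```

Both checks can fire on the very first page load or within a few seconds, which matches the "banned immediately" symptom in the question.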

Among the complex ones:
  • tracking mouse movement
  • running analytics on typical user behavior and looking for anomalies
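As a hedged illustration of the mouse-movement idea: one possible signal is how straight the cursor path is, since a scripted pointer often jumps or moves in a perfect line while a human hand wobbles. The function names and thresholds below are made up for the example; real systems presumably combine many such signals.

```python
import math


def path_straightness(points):
    """Straight-line distance divided by distance actually travelled (1.0 = perfectly straight)."""
    if len(points) < 2:
        return 1.0
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / travelled


def mouse_looks_scripted(points, min_points=5, max_straightness=0.995):
    """Flag a cursor trace that has too few samples or is almost perfectly straight.

    Both thresholds are assumptions chosen for illustration.
    """
    return len(points) < min_points or path_straightness(points) > max_straightness
```

A signal like this only makes sense after some activity has been recorded, so it cannot explain a ban on the first request.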

If they ban you right from the very first request, then you were caught on something elementary and primitive. With the complex protection options, the bot is allowed to walk around the site for a while, because analytics have to be collected first.
