Bots
Dima Barsukov, 2014-08-14 20:51:37

Is there a decent database/software to detect web bots?

I need to detect automated clicks on the site. We currently get up to several thousand clicks from bots a day, and this is a problem. One option is to group requests by IP + User-Agent and blacklist the IP when certain thresholds are exceeded. But I would like a more centralized solution.
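A minimal sketch of the IP + User-Agent grouping idea, assuming the day's requests are already available as `(ip, user_agent)` pairs. The function name `find_suspicious_ips` and the threshold of 1000 requests per day are illustrative assumptions, not recommended values:

```python
from collections import Counter

# Hypothetical threshold: daily requests allowed from one (ip, user_agent) pair.
DAILY_THRESHOLD = 1000


def find_suspicious_ips(requests, threshold=DAILY_THRESHOLD):
    """Return the IPs whose (ip, user_agent) pair exceeds the threshold.

    `requests` is an iterable of (ip, user_agent) tuples covering one day.
    """
    counts = Counter(requests)  # counts each distinct (ip, user_agent) pair
    return {ip for (ip, _ua), n in counts.items() if n > threshold}


# Synthetic example data using documentation-only IP ranges.
log = [("203.0.113.7", "Mozilla/5.0")] * 2000
log += [("198.51.100.2", "Mozilla/5.0")] * 15
blacklist = find_suspicious_ips(log)
```

A fixed threshold like this will also catch large NATs and proxies, so the resulting set is probably better treated as a list of candidates for review than as a final verdict.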
Some of the bots are good and identify themselves. You can handle them using the UAParser database, for example, or the user-agent-info database. But such good bots make up only about a third of the total.
The rest look like normal people. But when a couple of thousand requests per day come from one IP and one UA, suspicions creep in.
I also had the idea of using cookie or JS checks to detect bots, but no luck: some bots handle them, though not all of course.
I would like a combined solution. The first line of defense would be UAParser, to identify "good" bots. The second would be either an actively updated blacklist API or some kind of self-learning software that tracks down bots by their behavior and adds them to a database of bad actors.
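Roughly, the two-line scheme could look like the sketch below. Everything in it is an assumption for illustration: the short token list stands in for a real UAParser lookup, and the `blacklist` set stands in for an external blacklist API or a behavior-based database.

```python
# Illustrative tokens only; a real first line would query a maintained
# User-Agent database instead of this hard-coded tuple.
KNOWN_GOOD_BOT_TOKENS = ("googlebot", "bingbot", "yandexbot")


def classify(ip, user_agent, blacklist):
    """Classify one request as 'good_bot', 'bad_bot', or 'unknown'."""
    ua = user_agent.lower()
    # First line: bots that identify themselves in the User-Agent header.
    if any(token in ua for token in KNOWN_GOOD_BOT_TOKENS):
        return "good_bot"
    # Second line: IPs already flagged by a blacklist or behavior tracker.
    if ip in blacklist:
        return "bad_bot"
    return "unknown"


bad_ips = {"203.0.113.7"}
googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
result = classify("192.0.2.1", googlebot_ua, bad_ips)
```

Since a User-Agent string is trivial to fake, the first line only separates out bots that choose to introduce themselves; a spoofed "Googlebot" would pass it in this sketch.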
I would be grateful for any tips, including articles on the topic of detecting bots.


2 answers
throughtheether, 2014-08-14
@throughtheether

"I would be grateful for any tips, including articles on the topic of detecting bots."
I doubt that anyone will share a ready-made solution since, in my opinion, the cost of developing one is quite high. There are useful but rather general articles on the topic on the Incapsula blog, for example.

Alexander, 2014-08-15
@aspetek

I once asked a similar question: "How to solve the problem of cutting off bots in statistics?"
My situation was complicated by the fact that the user agent was unknown; I only had the IP.
With 2 scenarios in place, we managed to filter out most of the bots but, as it seems to me, we got a large proportion of false positives. In the end, we decided to postpone this topic for now.
