User Behavior Statistics Archive
Good afternoon
As part of my PhD research, I work on detecting anomalous user behavior (anomaly detection) in information systems by building behavior models. The model itself already exists, and on our toy problem (a small, artificially generated data set) it shows good detection results. A full-fledged study needs real data, however, and we do not have any.
So the question is: would anyone be willing to share data of this type? In return, I can share the results of the study and send already published articles about our approach.
Ideally, we need statistics on the behavior of a large number of users, in this form:
id - ideally, just an autoincremented value
user_id
session_id
transaction_id
datetime/timestamp (optional)
Where
user_id is a unique user ID.
session_id is the ID of one session of a user's work in the system (ideally, the order of the actions in the database should match the order in which they were performed).
transaction_id is a unique identifier for one of the possible actions in the system. For example, retrieving a person's profile is one type of transaction, regardless of whose profile is requested. Updating a profile is a different transaction_id, and so on.
datetime/timestamp (optional) is needed, in principle, to train models on data in the correct order, matching the order in which the actions happened in real life.
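To make the requested layout concrete, here is a minimal sketch of how such rows could be turned into per-session action sequences. It is only an illustration: the `Event` type, the `sessions_as_sequences` helper, and the sample rows are made up for this post, and it assumes that the autoincremented id reflects the real order of the actions.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """One row of the requested table (hypothetical type, for illustration)."""
    id: int                          # autoincremented; assumed to reflect action order
    user_id: int
    session_id: int
    transaction_id: int              # type of action, not the object it was applied to
    timestamp: Optional[str] = None  # optional


def sessions_as_sequences(events):
    """Group rows into per-session lists of transaction_id, ordered by id."""
    sequences = defaultdict(list)
    for event in sorted(events, key=lambda e: e.id):
        sequences[(event.user_id, event.session_id)].append(event.transaction_id)
    return dict(sequences)


# Made-up sample rows, not real data
sample_events = [
    Event(1, 10, 100, 7),
    Event(2, 11, 101, 3),
    Event(3, 10, 100, 8),
    Event(4, 10, 100, 7),
]
sequences = sessions_as_sequences(sample_events)
```

Each key of `sequences` is a (user_id, session_id) pair, and each value is the ordered list of action types for that session, which is the kind of input a sequence-based behavior model would be trained on.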
And the second table is:
user_id
user_role
where user_role is the role (or set of roles) of the user within the system. For example, a secretary, an ophthalmologist, a math teacher...
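As a rough sketch of how the second table could be used, the rows can be collapsed into a set of roles per user, so that a user with several roles is handled naturally. The `roles_by_user` helper and the sample rows are hypothetical, and one (user_id, user_role) row per role is an assumption about the storage format.

```python
from collections import defaultdict


def roles_by_user(role_rows):
    """Collapse (user_id, user_role) rows into a set of roles per user."""
    roles = defaultdict(set)
    for user_id, user_role in role_rows:
        roles[user_id].add(user_role)
    return dict(roles)


# Made-up sample rows, not real data; user 11 has two roles
sample_role_rows = [
    (10, "secretary"),
    (11, "ophthalmologist"),
    (11, "math teacher"),
]
user_roles = roles_by_user(sample_role_rows)
```

With this mapping, the sessions of users who share a role can be grouped together, for example to compare a user's behavior against others in the same role.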
Ideally, it would also be great to have both a data set known to be clean and one in which anomalous activity is present, for testing and cross-validation... Well, there's no harm in dreaming.
If anyone is interested, I will be eternally grateful. And of course, I will share the results of the study.
If the project is meant to be commercial, get it into a more or less presentable state (work out the UI and the integration with standard systems until they are sane) and at first offer it to everyone for free. If testers come running, you will get tired of chasing them away. Data for debugging the algorithms will flow like a river.
Hello Pavel!
Could you share articles that have already been published? Your topic is very interesting!
Thank you!