How to find patterns in application visit time?

Sergey Sokolov, 2017-12-15 22:34:16

Clustering

I have a year of data on when users open the application, in two columns: user_id and timestamp.
I would like to find patterns that roughly repeat across large groups of users.
I suspect there are similar shapes within a one-week window (for example, once a day early in the week and five times on weekends) and within a one-month window (activity tapering off as payday approaches, then a sharp jump after it).
How should I approach exploring the data along these lines?

2 answers

Danil, 2017-12-15
@DanilBaibak

One option is time series analysis. For Python, there is a good library from Facebook, Prophet. It works well out of the box and has built-in visualization tools.
There is an article about Prophet on Habr. I tried using the library to forecast the fluctuations of a currency pair; there is an example on GitHub.
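
As a rough illustration of this suggestion, here is a minimal sketch that turns raw visit timestamps into a daily series and fits Prophet with weekly and monthly seasonality. It assumes the timestamps are Unix seconds in UTC, which the question does not state. The helper names `daily_counts` and `fit_seasonal_model`, and the `period=30.5`, `fourier_order=5` settings for the monthly component, are illustrative choices and not something from the thread.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def daily_counts(unix_timestamps):
    """Aggregate Unix timestamps (assumed: seconds, UTC) into visits per day.

    Days with no visits are filled with 0 so the series has no gaps.
    Returns a list of (date, count) tuples in chronological order.
    """
    days = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).date()
        for ts in unix_timestamps
    )
    if not days:
        return []
    day, last = min(days), max(days)
    rows = []
    while day <= last:
        rows.append((day, days.get(day, 0)))
        day += timedelta(days=1)
    return rows


def fit_seasonal_model(rows):
    """Fit Prophet on (date, count) rows; returns (model, forecast)."""
    # Imported lazily so the aggregation above works without these packages.
    import pandas as pd
    from prophet import Prophet  # older releases were published as `fbprophet`

    df = pd.DataFrame(rows, columns=["ds", "y"])  # Prophet expects `ds` and `y`
    df["ds"] = pd.to_datetime(df["ds"])

    model = Prophet(
        weekly_seasonality=True,
        yearly_seasonality=False,  # only one year of data is available
        daily_seasonality=False,   # the series is already aggregated by day
    )
    # Assumed monthly cycle, to look for the payday effect from the question.
    model.add_seasonality(name="monthly", period=30.5, fourier_order=5)
    model.fit(df)
    forecast = model.predict(df[["ds"]])
    return model, forecast
```

Calling `model.plot_components(forecast)` on the result should then draw the trend, weekly and monthly components separately, which is where the patterns from the question would show up if they exist. Running this per user segment rather than on the whole population may be more informative, but that is a judgment call.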

SeptiM, 2017-12-23
@SeptiM

In general, this sounds like a clustering task. You can choose among different methods; the main thing is how you define the metric.
Offhand, you could define the distance between users Vasya and Petya as follows. For each of Vasya's visits, find Petya's nearest visit and take the time difference between them. This gives a set of times t_1, t_2, ..., t_B. Do the same for each of Petya's visits. Then take the average of all these values as the distance. The differences can be computed on a cyclic period (within a week, the distance between Sunday and Monday is one day).
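The distance between Vasya and Vasya is 0, and if Vasya and Petya log in at the same times, the distance between them is small. The triangle inequality, however, is not guaranteed: for visits at {0}, {0, 3} and {3} days, the pairwise distances are 1, 1 and 3. So treat this as a dissimilarity measure rather than a strict metric.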
P.S. The distance can be computed in time linear in the number of visits, using a linear merge of the sorted arrays.
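
The distance described above can be sketched as follows. This is one possible reading of the description, not a reference implementation: the function names are made up for illustration, visit times are reduced modulo the period, and the nearest visit is found with binary search (O(n log m)) instead of the linear merge from the P.S., purely to keep the code short.

```python
from bisect import bisect_left


def circular_gap(a, b, period):
    """Shortest distance between two positions on a cycle of length `period`."""
    d = abs(a - b) % period
    return min(d, period - d)


def nearest_gaps(xs, ys, period):
    """For each x, the circular gap to the nearest y.

    `ys` must be sorted and non-empty; all values must lie in [0, period).
    """
    n = len(ys)
    gaps = []
    for x in xs:
        i = bisect_left(ys, x)
        right = ys[i % n]   # wraps to the first visit when x is past the last
        left = ys[i - 1]    # wraps to the last visit when i == 0
        gaps.append(min(circular_gap(x, left, period),
                        circular_gap(x, right, period)))
    return gaps


def visit_distance(visits_a, visits_b, period):
    """Average nearest-visit gap, taken in both directions, on a cyclic period.

    Times and `period` must share one unit (e.g. days with period=7, or
    seconds with period=7*24*3600). Only differences matter, so the point
    where the cycle "starts" does not affect the result.
    """
    if not visits_a or not visits_b:
        raise ValueError("both users need at least one visit")
    a = sorted(t % period for t in visits_a)
    b = sorted(t % period for t in visits_b)
    gaps = nearest_gaps(a, b, period) + nearest_gaps(b, a, period)
    return sum(gaps) / len(gaps)
```

For example, with times in days and `period=7`, `visit_distance([0], [6], period=7)` gives 1.0, which matches the Sunday/Monday case. A pairwise matrix of these values can be passed to any clustering method that accepts precomputed dissimilarities.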