How to find time patterns in data?

Machine learning

Sergey Sokolov, 2017-07-06 14:26:16

There is a log consisting of events of only three types: A, B, and Y. Each record has two columns: a timestamp and a type ("A", "B", or "Y").
The hypothesis is that a certain arrangement in time of events A and B causes event Y.
Given the data, how can I find the most probable pattern of A and B events that causes Y?
For example, it might turn out that event Y was most often preceded by, among other things, events matching the following mask:

-50 seconds: A
-35 seconds: B
-05 seconds: A again
 00 seconds: event Y occurs

Other A/B events will also occur in the monitored time window from 0 to -50 s, but they do not form a pattern that repeats from one occurrence to the next, so they are considered noise and excluded from consideration. The signal-to-noise ratio can be heavily in favor of the noise. The size of the window under consideration is unknown; I can only guess, arbitrarily, that it is somewhere between 1 and 1000 seconds.
A timing match is also not exact to the integer but approximate, within some tolerance. That is, the pattern can be considered the same if the first event A occurred at -50.01 seconds one time and at -49.52 seconds another time. Relative to a window size of 50 seconds, ±1 second of accuracy is an acceptable approximation.
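
To make the matching rule concrete, here is a minimal Python sketch. The names `matches_mask`, `mask`, and `tol` are hypothetical, the log below is made-up illustration data, and the fixed ±1 s tolerance is simply the value from the example above. Extra A/B events in the window are treated as noise and ignored.

```python
def matches_mask(events, y_time, mask, tol=1.0):
    """Return True if every (offset, type) entry of the mask has a matching event.

    events: list of (timestamp, type) tuples, type in {"A", "B", "Y"}
    y_time: timestamp of the Y event under consideration
    mask:   list of (offset_seconds, type); offsets are negative (before Y)
    tol:    allowed timing error in seconds (assumed value)
    """
    for offset, kind in mask:
        target = y_time + offset
        # Any event of the right type close enough to the target time counts;
        # all other events in the window are ignored as noise.
        if not any(k == kind and abs(t - target) <= tol for t, k in events):
            return False
    return True


# The mask from the example above, and a small made-up log with one noise event.
mask = [(-50.0, "A"), (-35.0, "B"), (-5.0, "A")]
log = [(49.99, "A"), (57.3, "B"), (65.2, "B"), (94.6, "A"), (100.0, "Y")]

print(matches_mask(log, y_time=100.0, mask=mask))           # tolerance of 1 s
print(matches_mask(log, y_time=100.0, mask=mask, tol=0.1))  # stricter tolerance
```

This only checks a mask that is already known; finding the mask itself is the actual question.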

1 answer(s)

Sergey Sokolov, 2017-07-07
@sergiks

Looks like I found it: Kernel Density Estimation (KDE), a form of kernel smoothing. For my task, I need to align the data segments on the Y events and take the largest peaks in the KDE for the A and B events. Choosing the window width and the kernel width (bandwidth) is a separate issue.
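
A rough sketch of that idea in plain Python, using only the standard library. Everything here is an assumption for illustration: the function names, the Gaussian kernel, the 60 s window, the 1 s bandwidth, and the synthetic log (which plants the mask from the question among uniform noise). It shows the approach under those assumptions and is not a tuned implementation.

```python
import math
import random
from collections import defaultdict


def relative_offsets(events, window):
    """Collect offsets (t - t_Y) of A/B events within `window` s before each Y."""
    y_times = [t for t, k in events if k == "Y"]
    offsets = defaultdict(list)
    for t, k in events:
        if k == "Y":
            continue
        for ty in y_times:
            if 0.0 < ty - t <= window:
                offsets[k].append(t - ty)
    return offsets


def kde(samples, grid, bandwidth):
    """Gaussian kernel density estimate evaluated at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


def top_peaks(samples, window, bandwidth=1.0, step=0.25, n_peaks=2):
    """Return (offset, density) for the n_peaks highest local maxima of the KDE."""
    grid = [-window + i * step for i in range(int(window / step) + 1)]
    dens = kde(samples, grid, bandwidth)
    peaks = [
        (dens[i], grid[i])
        for i in range(1, len(grid) - 1)
        if dens[i] > dens[i - 1] and dens[i] >= dens[i + 1]
    ]
    return [(x, d) for d, x in sorted(peaks, reverse=True)[:n_peaks]]


def synthetic_log(n_y=20, noise_per_type=200, seed=0):
    """Made-up data: mask A@-50, B@-35, A@-5 before each Y, plus uniform noise."""
    rng = random.Random(seed)
    events = []
    for i in range(n_y):
        ty = 100.0 + 200.0 * i
        for offset, kind in [(-50.0, "A"), (-35.0, "B"), (-5.0, "A")]:
            # Jitter of up to half a second around the nominal offset.
            events.append((ty + offset + rng.uniform(-0.5, 0.5), kind))
        events.append((ty, "Y"))
    span = 200.0 * n_y
    for kind in "AB":
        events += [(rng.uniform(0.0, span), kind) for _ in range(noise_per_type)]
    return sorted(events)


offsets = relative_offsets(synthetic_log(), window=60.0)
for kind in ("A", "B"):
    print(kind, top_peaks(offsets[kind], window=60.0))
```

The density is normalized separately per event type, so peak heights are comparable within one type but not across types. With real data the bandwidth would probably need to be on the order of the timing tolerance, and the window would have to be tried at several sizes, since both are unknown in advance.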