data mining
lumaxy, 2017-09-14 18:57:27

What is the best way to implement event log parsing (operations in the customer service system)?

There is a fairly detailed event log from an application used to serve an organization's clients (roughly speaking, a client comes in, the operator serves them and records the operation in the application). The log itself is well structured: the date and time of the event, the system user, the event description, and so on are easy to extract from each entry. The description contains a wide variety of text, from "user such-and-such logged in" to "the user performed such-and-such an operation with such-and-such parameters, and the result was such-and-such". The operation description has no uniform format, because several subsystems with different functionality write to the same log file (each entry also notes which subsystem wrote it). By a preliminary estimate, the log contains several hundred different types of records.
From the user's point of view, a single action in the system (for example, "payment for a service") generates one or more separate log entries (for example, "login to the subsystem", "selection of an object", "operation attempt", "confirmation", "result"). For a given action, the set of operations and their order is not 100% fixed; slight variations are possible. My task now is to recognize user actions in the log, determine their duration, and assign each recognized action to a group of actions with the same or a similar sequence of log entries.
Clarifications:
1. Each user action (there are many kinds of actions, and I don't even have an exhaustive list of them yet) produces a set of log entries with some variability, for example when the operator makes a mistake during the operation or the client backs out. At the first stage, though, this variability can be ignored.
2. The record set I gave for one user action is only an example to illustrate the structure of the log. In reality, different user actions may produce record sets that look nothing like my example, and I don't know them in advance. So the finite set of repeating entry sequences that occur in the log has to be determined automatically somehow.
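To make the task concrete, here is a naive baseline sketch in Python. It is not a proposed solution. It assumes the entries are already parsed into (timestamp, user, event type) tuples, and that a pause longer than some threshold between two entries of the same user separates two actions. The sample records, the two-minute threshold, and the function names are all made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical parsed log records: (timestamp, user, event type).
RECORDS = [
    ("2017-09-14 10:00:00", "op1", "login to the subsystem"),
    ("2017-09-14 10:00:05", "op1", "selection of an object"),
    ("2017-09-14 10:00:20", "op1", "operation attempt"),
    ("2017-09-14 10:00:25", "op1", "confirmation"),
    ("2017-09-14 10:00:30", "op1", "result"),
    ("2017-09-14 10:01:00", "op2", "login to the subsystem"),
    ("2017-09-14 10:01:15", "op2", "result"),
    ("2017-09-14 10:10:00", "op1", "login to the subsystem"),
    ("2017-09-14 10:10:10", "op1", "selection of an object"),
    ("2017-09-14 10:10:30", "op1", "operation attempt"),
    ("2017-09-14 10:10:35", "op1", "confirmation"),
    ("2017-09-14 10:10:40", "op1", "result"),
]


def split_into_actions(records, max_gap=timedelta(minutes=2)):
    """Split each user's entries into actions wherever the pause exceeds max_gap.

    The gap rule is an assumption, not something the log guarantees.
    """
    by_user = defaultdict(list)
    for ts, user, event in records:
        by_user[user].append((datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), event))

    actions = []
    for user, events in by_user.items():
        events.sort(key=lambda e: e[0])  # chronological order per user
        current = [events[0]]
        for prev, cur in zip(events, events[1:]):
            if cur[0] - prev[0] > max_gap:
                actions.append((user, current))
                current = []
            current.append(cur)
        actions.append((user, current))
    return actions


def group_by_signature(actions):
    """Group actions by their exact sequence of event types.

    Returns {signature: [(user, duration in seconds), ...]}.
    """
    groups = defaultdict(list)
    for user, events in actions:
        signature = tuple(event for _, event in events)
        duration = (events[-1][0] - events[0][0]).total_seconds()
        groups[signature].append((user, duration))
    return dict(groups)


if __name__ == "__main__":
    for signature, found in group_by_signature(split_into_actions(RECORDS)).items():
        print(len(found), "x", " > ".join(signature), found)
```

This only groups exactly identical sequences, so it ignores the variability from point 1, and a fixed time gap will split or merge actions incorrectly whenever operators work faster or slower than the threshold assumes.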
I'm asking for practical advice: how should I approach this problem, and how do you see the solution algorithm?


1 answer(s)
Alexey Epsilon, 2017-09-21
@Epsiloncool

Import it into MySQL and run queries against it. Is that not an option?
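A rough sketch of what that could look like, written in Python. To keep the example self-contained it uses the built-in `sqlite3` module as a stand-in for MySQL. The table name, column names, and sample rows are assumptions made up for illustration, since the real log layout is not shown in the question.

```python
import sqlite3

# In-memory SQLite database used here as a stand-in for a MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "  ts TEXT, username TEXT, subsystem TEXT, description TEXT)"
)

# Hypothetical rows, as they might look after parsing the log file.
rows = [
    ("2017-09-14 10:00:00", "op1", "payments", "login to the subsystem"),
    ("2017-09-14 10:00:05", "op1", "payments", "selection of an object"),
    ("2017-09-14 10:00:30", "op1", "payments", "result"),
    ("2017-09-14 10:01:00", "op2", "payments", "login to the subsystem"),
    ("2017-09-14 10:01:15", "op2", "accounts", "result"),
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)

# How often does each record type occur in each subsystem?
type_counts = conn.execute(
    "SELECT subsystem, description, COUNT(*) AS n "
    "FROM events "
    "GROUP BY subsystem, description "
    "ORDER BY n DESC, subsystem, description"
).fetchall()

# First entry, last entry, and number of entries per user.
per_user = conn.execute(
    "SELECT username, MIN(ts), MAX(ts), COUNT(*) "
    "FROM events GROUP BY username ORDER BY username"
).fetchall()

for row in type_counts:
    print(row)
for row in per_user:
    print(row)
```

Queries like these help with taking stock of the several hundred record types. Finding repeating sequences of entries would still need extra logic on top, either in SQL or in application code.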
