Python
10farid10, 2020-12-14 15:50:03

How do I parse text logs in Python?

I need to analyze text logs that record user logins and logouts.
The login and logout entries look like this:
10:08:54 AM Login successful from user Sergey Pavlov id_session=2710
10:35:33 PM Closing session (user $escalation$, reserved) id_session=2710 reason=1,20

The “Login successful” tag tells us that the user Sergey Pavlov has logged in, and the entry gives the session number id_session=2710. A later “Closing session” entry with the same session number means that the user has logged out.
I need to find out how long each session was active by matching the session numbers. In other words, I need the session id, the user's full name, and the session duration. Links to anything useful for solving this would be appreciated. Thanks.

1 answer(s)
soremix, 2020-12-14
@SoreMix

sessions.txt
trashstring
trashstring
trashstring
10:08:54 AM Login successful from user Sergey Pavlov id_session=2710
trashstring
trashstring
trashstring
trashstring
trashstring
trashstring
trashstring
trashstring
trashstring
trashstring
10:35:33 PM Closing session (user $escalation$, reserved) id_session=2710 reason=1,20
trashstring
trashstring
trashstring
trashstring
trashstring
trashstring
trashstring

import re
from datetime import datetime


sessions = {}


with open('sessions.txt', 'r', encoding='utf-8') as f:
    log = f.readlines()


for line in log:

    if 'Login successful' in line:
        chunks = re.search(r'(.+?) Login successful from user (.+?) id_session=(.+?)$', line)
        if chunks is None:
            # The line mentions a login but does not match the expected layout
            continue

        # The time format may need adjusting if I guessed it wrong
        login_time = datetime.strptime(chunks.group(1), '%I:%M:%S %p')

        sessions[chunks.group(3)] = {'login_time': login_time, 'username': chunks.group(2)}

    elif 'Closing session' in line:
        chunks = re.search(r'(.+?) Closing session.+id_session=(.+?)\sreason=(.+)$', line)
        if chunks is None:
            # For example, a closing entry without a reason= field
            continue

        logout_time = datetime.strptime(chunks.group(1), '%I:%M:%S %p')
        session_id = chunks.group(2)

        if session_id not in sessions:
            print('Session {} closed, no login data found'.format(session_id))

        else:
            login_time = sessions[session_id]['login_time']
            username = sessions[session_id]['username']
            session_time = logout_time - login_time

            print('User {} ended the session after {}, reason: {}'.format(username, session_time, chunks.group(3)))

            del sessions[session_id]
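
If it helps, here is a rough sketch of the same idea wrapped in a function that returns the session id, user name and duration instead of printing them. The names session_durations, LOGIN_RE, LOGOUT_RE and TIME_FORMAT are mine, and the time format is guessed from the two sample lines. The log has no dates, so I assume that a logout which appears earlier in the day than the login happened on the next day. Sessions longer than 24 hours cannot be detected this way.

```python
import re
from datetime import datetime, timedelta

# Assumed from the sample lines, e.g. "10:08:54 AM"
TIME_FORMAT = '%I:%M:%S %p'

# Hypothetical patterns built from the two sample entries in the question
LOGIN_RE = re.compile(
    r'^(\d{1,2}:\d{2}:\d{2} [AP]M) Login successful from user (.+?) id_session=(\d+)'
)
LOGOUT_RE = re.compile(
    r'^(\d{1,2}:\d{2}:\d{2} [AP]M) Closing session.*?id_session=(\d+)'
)


def session_durations(lines):
    """Return a list of (session_id, username, duration) for every closed session."""
    open_sessions = {}  # session_id -> (username, login time)
    results = []

    for line in lines:
        login = LOGIN_RE.search(line)
        if login:
            started = datetime.strptime(login.group(1), TIME_FORMAT)
            open_sessions[login.group(3)] = (login.group(2), started)
            continue

        logout = LOGOUT_RE.search(line)
        if logout and logout.group(2) in open_sessions:
            username, started = open_sessions.pop(logout.group(2))
            duration = datetime.strptime(logout.group(1), TIME_FORMAT) - started
            if duration < timedelta(0):
                # Assumption: the logout happened on the day after the login
                duration += timedelta(days=1)
            results.append((logout.group(2), username, duration))

    return results


sample = [
    'trashstring',
    '10:08:54 AM Login successful from user Sergey Pavlov id_session=2710',
    '10:35:33 PM Closing session (user $escalation$, reserved) id_session=2710 reason=1,20',
]

for session_id, username, duration in session_durations(sample):
    print(session_id, username, duration)
```

On the sample lines this prints 2710 Sergey Pavlov 12:26:39. To run it on the real file, pass the open file object (or the list from f.readlines()) to session_durations.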
