Python
SymerDiff, 2021-07-10 22:17:43

How to parse a log in Python?

I have a mail server log from which I need to extract the sender, the recipient, and the delivery status (success or error).
I have a regular expression that finds the email address and the status:

reg_email = r'(from=<[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.>]+|status=\w{1,20})'


Reading from the log file:
import re

def reader(filename):
    # Sender address or delivery status
    reg_email = r'(from=<[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.>]+|status=\w{1,20})'
    # Recipient address or delivery status ("|" added to mirror reg_email)
    reg_email1 = r'(to=<[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.>]+|status=\w{1,20})'
    with open(filename) as f:
        log = f.read()
    # The file is closed at this point; the searches run on the in-memory text
    list_email = re.findall(reg_email, log)
    list_email1 = re.findall(reg_email1, log)
    return list_email, list_email1

As a result, I get two large lists. I need to output everything to CSV (email, number of requests, status). How can this be done? Thank you!


1 answer(s)
Vindicar, 2021-07-11
@SymerDiff

You can use collections.Counter for the counting and the csv module to generate the CSV file. If the task allows, there is no need to load everything into memory first and then dump it to CSV; it is better to write rows out as the matching lines are found. To give more detailed advice, though, I would need to see an example of the lines you are looking for in the log file. Also, if an expression is used more than once or twice, I recommend calling re.compile() once and then using the compiled pattern's .findall() method instead of plain re.findall(), so the regular expression does not have to be compiled (or fetched from the re module's cache) on every call.
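
Since the real log lines were never posted, the following is only a rough sketch of the approach described above, not a tested solution for this particular log. It assumes a Postfix-like layout in which the recipient and the status appear on the same line as to=<...> and status=...; the sample lines, the names count_deliveries and write_csv, and the CSV column names are all made up for illustration. It counts (recipient, status) pairs only; tying the sender to a delivery would need more knowledge of the log format.

```python
import csv
import re
from collections import Counter

# Address pattern taken from the question. The "to=<...> ... status=..."
# layout around it is an ASSUMPTION about the log format.
ADDR = r'[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+'

# Compiled once and reused for every line.
TO_STATUS_RE = re.compile(r'to=<(' + ADDR + r')>.*?status=(\w{1,20})')

# Hypothetical sample lines, invented for this sketch.
SAMPLE_LINES = [
    'Jul 10 22:17:41 mx postfix/qmgr[9]: A1: from=<alice@example.com>, size=512',
    'Jul 10 22:17:43 mx postfix/smtp[7]: A1: to=<bob@example.com>, delay=0.5, status=sent (250 ok)',
    'Jul 10 22:18:02 mx postfix/smtp[7]: B2: to=<carol@example.org>, delay=1.2, status=bounced (550)',
    'Jul 10 22:19:10 mx postfix/smtp[7]: C3: to=<bob@example.com>, delay=0.3, status=sent (250 ok)',
]


def count_deliveries(lines):
    """Count (recipient, status) pairs, reading one line at a time."""
    counts = Counter()
    for line in lines:
        match = TO_STATUS_RE.search(line)
        if match:
            counts[match.groups()] += 1
    return counts


def write_csv(counts, out):
    """Write a header and one row per (email, status) pair with its count."""
    writer = csv.writer(out)
    writer.writerow(['email', 'count', 'status'])
    for (email, status), n in sorted(counts.items()):
        writer.writerow([email, n, status])


if __name__ == '__main__':
    import sys
    write_csv(count_deliveries(SAMPLE_LINES), sys.stdout)
```

For a real file, an open file object can be passed in place of SAMPLE_LINES, so the log is read line by line rather than loaded whole with f.read(); the output file should be opened with newline='' as the csv module documentation recommends. Only the counts are kept in memory, not the log itself.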
