How to parse logs with bash?
Good afternoon.
I need to reformat the TIME field in a log whose lines look like this:
RESULT=xxxxx, TIME=2020-01-20 18:43:12, HOST=xxxxxxxxxxx, NAME=xxxxxxxx
#!/bin/bash
date_prefix=$(date -d '1 hour ago' '+ %z %a %b %d %G')
cat LOG.csv | while read line
do
    date_setup=$(echo "${line}" | grep -o 'TIME=...................' | awk -v var="${date_prefix}" '{print $2 ".000" var}')
    echo "${line}" | sed "s/TIME=.................../TIME=${date_setup}/g" >> LOG_new.csv
done
A snippet like this runs at roughly a few megabytes per second.
Measured like this:
yes "RESULT=xxxxx, TIME=2020-01-20 18:43:12, HOST=xxxxxxxxxxx, NAME=xxxxxxxx" \
| pv \
| py -x "', '.join(['='.join((k, datetime.datetime.strptime(v, '%Y-%m-%d %H:%M:%S').strftime('%H:%M:%S.000 +0700 %a %b %d %Y')) if k == 'TIME' else (k, v)) for k, v in ((kv.split('=') for kv in x.split(', ')))])" \
> /dev/null
For every line of the log you launch several programs/processes, and the output file is opened and closed each time. It is no surprise that this contortion is slow.
It should be something like this:
Remove cat — awk can read the file itself.
The "grep -o TIME=..." step can be moved into awk; it has a convenient tool for that.
Running "date" can also be done in awk by formatting the date manually (and it does not depend on the line, so it only needs to be computed once anyway).
Or, at the very least, move ">> LOG_new.csv" out of the loop — redirect once after "done"; at worst, wrap the loop in braces and redirect that.
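Put together, that advice can be sketched as a single awk pass that reproduces the script's substitution (time-of-day plus a suffix computed once from the current date). This is a sketch, not a drop-in replacement: it assumes GNU date for the -d flag, and the demo input line and the LOG.csv / LOG_new.csv names are taken from the question.

```shell
# Demo input: one line in the format shown in the question.
printf '%s\n' 'RESULT=xxxxx, TIME=2020-01-20 18:43:12, HOST=xxxxxxxxxxx, NAME=xxxxxxxx' > LOG.csv

# Compute the timezone/date suffix once, outside any loop (GNU date assumed).
suffix=$(date -d '1 hour ago' '+%z %a %b %d %G')

# A single awk process does the matching and the substitution for every line.
awk -v suffix="$suffix" '
{
    if (match($0, /TIME=[0-9-]+ [0-9:]+/)) {        # locate "TIME=YYYY-MM-DD HH:MM:SS"
        ts  = substr($0, RSTART + 5, RLENGTH - 5)   # "2020-01-20 18:43:12"
        split(ts, p, " ")                           # p[1] = date, p[2] = time
        new = "TIME=" p[2] ".000 " suffix
        $0  = substr($0, 1, RSTART - 1) new substr($0, RSTART + RLENGTH)
    }
    print
}' LOG.csv > LOG_new.csv
```

One process for the whole file, and the output file is opened exactly once by the final redirect. Even without rewriting in awk, simply changing "done" to "done > LOG_new.csv" in the original loop already removes the per-line open/close of the output file.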
It would be easier to help if you gave an example of the log, and an explanation of what you are trying to do would be nice.