Regular Expressions
Vincent1, 2022-01-29 16:18:13

How to exclude a word in a regular expression?

I am grepping through a log and need to count only the lines that contain "Googlebot" but do not contain ".xml". I tried something like this, but it doesn't work properly:
^.*?(?!xml).*?Googlebot.*?$

Sample log lines:
example.com:443 66.249.64.61 - - [29/Jan/2022:04:42:22 +0300] "GET / HTTP/1.0" 200 4433 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
example.com:443 66.249.64.47 - - [29/Jan/2022:04:47:45 +0300] "GET / HTTP/1.0" 200 6232 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
example.com:443 66.249.64.61 - - [29/Jan/2022:04:50:12 +0300] "GET / HTTP/1.0" 200 4433 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.70.74, 66.249.70.74 - - [29/Jan/2022:04:50:29 +0300] "GET /xml/s.xml HTTP/1.0" 200 1305 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
example.com:443 66.249.70.79 - - [29/Jan/2022:04:56:46 +0300] "GET / HTTP/1.0" 200 5540 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.70.74, 66.249.70.74 - - [29/Jan/2022:05:04:15 +0300] "GET /xml/s.xml HTTP/1.0" 200 1291 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
example.com:443 66.249.64.69 - - [29/Jan/2022:05:04:29 +0300] "GET / HTTP/1.0" 200 6271 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.70.76, 66.249.70.76 - - [29/Jan/2022:05:06:47 +0300] "GET /xml/s.xml HTTP/1.0" 200 1311 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.70.76, 66.249.70.76 - - [29/Jan/2022:05:07:06 +0300] "GET /xml/s.xml HTTP/1.0" 200 1273 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


2 answers
Alexandroppolus, 2022-01-29
@Vincent1

^(?!.*?xml).*?Googlebot.*?$
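
A note on why this differs from the pattern in the question: there, `(?!xml)` is checked at only one position, and the lazy `.*?` before it is free to pick a position where the next three characters are not "xml" (the very start of the line is enough). So every line containing "Googlebot" still matches. Moving `.*?xml` inside the lookahead and anchoring it at `^` rejects the line if "xml" occurs anywhere in it. Below is a minimal Python sketch of the difference. The three sample lines are shortened, hypothetical stand-ins for the log in the question, and `count_matches` is just an illustrative helper. With grep itself, lookaheads generally need PCRE mode (`grep -P` in GNU grep, if it was built with PCRE support).

```python
import re

# Patterns copied from the question and from the answer above.
ORIGINAL = re.compile(r'^.*?(?!xml).*?Googlebot.*?$')
CORRECTED = re.compile(r'^(?!.*?xml).*?Googlebot.*?$')

# Shortened, hypothetical stand-ins for the log lines in the question.
lines = [
    '"GET / HTTP/1.0" 200 4433 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '"GET /xml/s.xml HTTP/1.0" 200 1305 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '"GET / HTTP/1.0" 200 512 "-" "Mozilla/5.0 (compatible; OtherBot/1.0)"',
]


def count_matches(pattern, lines):
    """Count the lines on which the compiled pattern matches."""
    return sum(1 for line in lines if pattern.search(line))


original_count = count_matches(ORIGINAL, lines)
corrected_count = count_matches(CORRECTED, lines)

# The original pattern also counts the /xml/s.xml request;
# the corrected one counts only the Googlebot line without "xml".
print(original_count, corrected_count)  # prints: 2 1
```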

Alexey Yarkov, 2022-01-29
@yarkov

Perhaps I misunderstood the task.
test.log

Googlebot.html
Googlebot.php
Googlebot.xml

Command
cat test.log | grep Googlebot | grep -v xml
Result
Googlebot.html
Googlebot.php
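
Since the question asks for a count rather than the lines themselves, adding `-c` to the last `grep` should print the number of remaining lines instead. The same include/exclude filter can also be sketched in Python without any regular expression. This is only an illustration: `filter_lines` and its parameter names are made up here, and the input mirrors the three-line test.log above. Passing `exclude=".xml"` instead would match the question's wording more strictly.

```python
def filter_lines(lines, include="Googlebot", exclude="xml"):
    """Keep lines that contain `include` and do not contain `exclude`."""
    return [line for line in lines if include in line and exclude not in line]


# Contents of the test.log example above.
lines = ["Googlebot.html", "Googlebot.php", "Googlebot.xml"]

kept = filter_lines(lines)
count = len(kept)

print(kept)   # ['Googlebot.html', 'Googlebot.php']
print(count)  # 2
```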
