linux
Ad3pt, 2012-11-15 03:00:50

Choosing a CPU for grep

Good day to all.
Here is the situation: there is an old server (4 CPU cores) connected to a SCSI (DAS) disk shelf. The disks hold apache/nginx logs. Periodically, grep is run over part of these logs (usually a single directory of 80-120 GB). The best way I have found to run grep is:

find /logdir/log_5*201211060* | xargs -n 1 -P 4 lzcat | grep -i "some_phrase"

The search speed is highly dependent on the CPU. It's time to order a new server, and I'm trying to choose a CPU (preferably Intel) that performs as well as possible on this task. Please tell me which characteristics to look at first: clock frequency, cache size, perhaps memory frequency, or something else.
Many thanks in advance.

P.S. Perhaps there is a better way to run grep. For some reason, GNU parallel performs worse on my task than xargs -P 4.

2 answer(s)
Gribozavr, 2012-11-15
@gribozavr

If the search is CPU-bound, it might make sense to write a specialized search tool.

Alexey Akulovich, 2012-11-15
@AterCattus

Isn't the decompression step (lzcat) the CPU-bound link here? During a search, when you sort the processes in top by CPU usage, does grep sit at the top, above the four lzcat processes?
If it is grep, then I second the suggestion to get rid of -i.
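
If grep does turn out to be the bottleneck, note that the original pipeline feeds four lzcat processes into a single grep, so grep can use at most one core. A minimal sketch of one alternative, assuming an xargs that supports -0 and -P (GNU and BSD both do): run one decompress-and-grep pipeline per file, match a fixed string with -F, and force the C locale, which is commonly reported to make -i much cheaper. The function name search_logs and the DECOMP and JOBS variables are made up for this illustration.

```shell
#!/bin/sh
# search_logs PATTERN FILE...
# Hypothetical helper: one "decompress | grep" pipeline per file, up to JOBS
# pipelines at a time, so grep is parallelised along with the decompressor.
#   DECOMP  decompressor command (assumed default: lzcat)
#   JOBS    number of parallel pipelines (assumed default: 4)
search_logs() {
    pattern=$1
    shift
    # Nothing to do when no files were given.
    [ "$#" -gt 0 ] || return 0
    # NUL-separated file names are safe for paths containing spaces.
    # LC_ALL=C makes grep work on bytes rather than multibyte characters.
    # -F treats the pattern as a fixed string; drop -i if the case is known.
    # A grep exit status of 1 (no match in this file) is not treated as an error.
    printf '%s\0' "$@" |
        LC_ALL=C xargs -0 -n 1 -P "${JOBS:-4}" \
            sh -c '"$1" "$3" | grep -F -i -- "$2" || [ "$?" -eq 1 ]' \
            sh "${DECOMP:-lzcat}" "$pattern"
}
```

It could be called as search_logs "some_phrase" /logdir/log_5*201211060*. The order of matches across files is not deterministic, and with very heavy output the lines from different workers may need to be kept apart, for example by writing one result file per input. To check which stage dominates before changing anything, compare the time of lzcat on one file with output sent to /dev/null against the time of the same file piped through grep.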
