Which is better on Linux for removing duplicates from a large file: awk or uniq?
There is a 107 GB txt file on a drive with 109 GB total. What is the best tool for quickly removing duplicate lines from it?
I tried the command awk '!seen[$0]++' text.txt
It started out fast, but after 15-17 hours I could watch it crawling through the file line by line, and it really started to bog down the machine.
I am also looking at uniq text.txt > text_new.txt
but I don't know whether it will be any faster than the previous command. Can anyone advise?
Don't print the result to the screen; redirect it to a file and the speed will pleasantly surprise you ;)
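A minimal sketch of that advice, using the filenames from the question; the tiny sample file created with printf is just a stand-in for the real 107 GB text.txt:

```shell
# Stand-in input file (assumption: a few lines with repeats).
printf 'a\nb\na\nc\nb\n' > text.txt

# Keep only the first occurrence of each line. Redirecting to a
# file avoids the cost of drawing every line in the terminal.
awk '!seen[$0]++' text.txt > text_new.txt

cat text_new.txt   # prints a, b, c
```

Note that this awk approach keeps every unique line in an in-memory hash, so on a file this size the memory footprint grows with the number of distinct lines.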
awk and uniq are about the same in speed.
I work with database dumps through sed and awk; a 250 GB text file takes about 5 minutes in total once the job is set up...