How to remove lines from a large file that are in another file?
The task is to remove from a 10 GB file (1.txt) every line that also appears in another, 5 GB file (2.txt). The computer has 8 GB of RAM.
Both files are sorted, encoded in UTF-8 without a BOM, and contain no carriage returns (\x0D or \r). Almost all lines of 2.txt are present in 1.txt.
Are there any ready-made solutions for this task? Preferably for Windows, but Linux is fine too.
I tried 3 solutions:
The first, under Linux, is the comm command. I run it as comm -23 1.txt 2.txt > out.txt, and the output contains all the lines of 1.txt as is; the lines that are also in 2.txt are not removed.
If I use grep -vf 2.txt 1.txt > out.txt instead, it crashes because it runs out of RAM.
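For reference, here is a minimal sketch of how comm -23 is expected to behave on two tiny sorted files. The sample file names and their contents are made up for illustration. One possible cause of the behavior described above, which is an assumption and not something confirmed here, is that the files are sorted in a different order than the one comm expects under the current locale. Setting LC_ALL=C makes both sort and comm use plain byte order, and sort -c checks whether a file is already in that order.

```shell
# Hypothetical sample data; the real files are 1.txt (10 GB) and 2.txt (5 GB).
printf 'apple\nbanana\ncherry\ndate\n' > sample1.txt
printf 'banana\ndate\n' > sample2.txt

# comm expects both inputs to be sorted in the collation order of the
# current locale. LC_ALL=C forces plain byte order for sort and comm alike.
# sort -c only checks the order: it exits non-zero and reports the first
# out-of-order line, and prints nothing if the file is sorted.
LC_ALL=C sort -c sample1.txt
LC_ALL=C sort -c sample2.txt

# -2 suppresses lines found only in the second file, -3 suppresses lines
# common to both, so only lines unique to sample1.txt remain.
LC_ALL=C comm -23 sample1.txt sample2.txt > sample_out.txt
cat sample_out.txt
```

comm reads both files sequentially, line by line, so it does not need to hold either file in RAM. If sort -c reports a line as out of order on the real files, they would have to be re-sorted under the same LC_ALL=C setting before comm can compare them correctly.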
The second solution is the Windows utility findstr, launched with the parameters:
FINDSTR /V /G:C:\2.txt C:\1.txt > C:\new.txt
It runs for a long time, a couple of days or so, and then fails with an error anyway.
The third solution is the TextPipe Pro software, and it behaves strangely. I specify the file to work on, choose the filter that deletes lines, and specify the file to take those lines from. The result is that the second file is ignored: as with the Linux comm utility, I get the original 1.txt back unchanged.
With small amounts of data TextPipe Pro works as it should. With somewhat larger ones, for example filtering out of a 1 GB file the lines that are in a 0.5 GB file, it reports an out-of-memory error. With the 10 GB file it skips the operation entirely while appearing to perform it.
I would be very grateful if someone could suggest a solution. I'm not a programmer, so please do not suggest writing a script (bash, Python) or loading everything into a database. I would also ask you not to suggest options that would take more than a week to remove the lines from a 10 GB file.