How to find and display duplicate lines in a text file?
There are two text files, file_0.txt and file_1.txt. The number of lines in each may differ, the lines can be of different lengths, and both files are large. I need an efficient way to write the lines that occur in both files to a third file.
Example:
Contents of file_0.txt:
j43j72h531
b2x891ow52
rr35986z77
x77jm9lp7g
q0pprcp52yawc10
wh3h476m2u
e7h0cv6rh5
5l7i700939
l3ri0p8p2f
l1h14no300
l1h14no300
j2615a2e0y
815555v33h
q0pprcp52yawc10
2vhhh0ugxv
rc2jl8lhdl
79qn640321
b2x891ow52
b2x891ow52
q0pprcp52yawc10
l1h14no300
Something like this should help (it is PowerShell, but it uses the .NET HashSet, so it is almost C#):
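# Load each file into a HashSet of its distinct lines
$c = [string[]](Get-Content .\0.txt)
$sk1 = [System.Collections.Generic.HashSet[string]]::new($c)
$c = [string[]](Get-Content .\1.txt)
$sk2 = [System.Collections.Generic.HashSet[string]]::new($c)
# Keep in $sk1 only the lines that are also present in $sk2
$sk1.IntersectWith($sk2)
# Output the common lines
$sk1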
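# Read both files as arrays of lines
$f0 = Get-Content -Path C:\tmp\file_0.txt
$f1 = Get-Content -Path C:\tmp\file_1.txt
# Intersect yields the distinct lines present in both arrays; write them to file_2.txt
[System.Linq.Enumerable]::Intersect([object[]]$f0, [object[]]$f1) | Out-File -FilePath C:\tmp\file_2.txt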
If I were you, I would look at the lines that are currently missing from the result and work out what is different about them. They may not be exactly identical in the two files.
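One way to test that guess is to compare the files after normalizing the lines. This is only a minimal sketch, and it assumes the mismatches come from leading or trailing whitespace or from letter case (my assumption, the question does not say so). The name Get-CommonLine and its parameters are hypothetical, not a built-in cmdlet:

```powershell
# Hypothetical helper, not part of PowerShell.
# Assumption: "equal" lines may differ in surrounding whitespace or letter case.
function Get-CommonLine {
    param(
        [Parameter(Mandatory)] [string] $Path0,
        [Parameter(Mandatory)] [string] $Path1
    )
    $comparer = [System.StringComparer]::OrdinalIgnoreCase

    # Trimmed lines of the first file, compared case-insensitively
    $seen = [System.Collections.Generic.HashSet[string]]::new($comparer)
    foreach ($line in Get-Content -LiteralPath $Path0) {
        [void]$seen.Add($line.Trim())
    }

    # Emit each trimmed line of the second file that is in the set, once
    $emitted = [System.Collections.Generic.HashSet[string]]::new($comparer)
    foreach ($line in Get-Content -LiteralPath $Path1) {
        $key = $line.Trim()
        if ($seen.Contains($key) -and $emitted.Add($key)) { $key }
    }
}
```

It could be called as Get-CommonLine -Path0 C:\tmp\file_0.txt -Path1 C:\tmp\file_1.txt | Out-File -FilePath C:\tmp\file_2.txt. If it returns more lines than the exact comparison does, the extra ones are the near-duplicates worth inspecting.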
findstr is the obvious tool here. It is also fast enough for the job.
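A rough sketch of what the findstr call could look like, wrapped in a small function so the paths are not hard-coded. The function name Find-CommonLineWithFindstr is hypothetical; the /L, /X and /G: switches are findstr's own. It assumes Windows (findstr.exe ships with it) and plain ANSI/ASCII text files, and I have not checked how it behaves with a very large search-string file, so compare its output with one of the set-based answers before relying on it:

```powershell
# Hypothetical wrapper around findstr.exe (Windows only).
#   /L   treat the search strings literally, not as regular expressions
#   /X   print only lines that match a search string exactly
#   /G:  read the search strings, one per line, from the given file
function Find-CommonLineWithFindstr {
    param(
        [Parameter(Mandatory)] [string] $Path0,  # its lines become the search strings
        [Parameter(Mandatory)] [string] $Path1   # the file that is scanned
    )
    findstr /L /X "/G:$Path0" $Path1
}
```

For example: Find-CommonLineWithFindstr -Path0 C:\tmp\file_0.txt -Path1 C:\tmp\file_1.txt | Out-File -FilePath C:\tmp\file_2.txt. Unlike the set-based answers, this prints a matching line as many times as it occurs in the scanned file, so repeated lines are not collapsed.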