Answer the question
In order to leave comments, you need to log in
How to compare two arrays and leave only matching data without loading the arrays into memory?
Hello
There are two integer arrays - in each, let's say, a billion different sorted values (to lie in the form of binary files on disk). How can you compare these two arrays and leave only those values that match in both arrays, without loading the arrays themselves into memory, but using only disk?
Thank you.
Answer the question
In order to leave comments, you need to log in
If the values are really different, sorted in ascending order and they are 4 bytes in size:
var stream1 = new BinaryReader(File.OpenRead("someFile1.bin"));
var stream2 = new BinaryReader(File.OpenRead("someFile2.bin"));
var output = new BinaryWriter(File.OpenWrite("output.bin"));
if (stream1.PeekChar() != -1 && stream2.PeekChar() != -1)
{
var val1 = stream1.ReadInt32();
var val2 = stream2.ReadInt32();
while (true)
{
if (val1 == val2)
{
output.Write(val1);
if (stream1.PeekChar() == -1 || stream2.PeekChar() == -1) break;
val1 = stream1.ReadInt32();
val2 = stream2.ReadInt32();
}
else if (val1 < val2)
{
if (stream1.PeekChar() == -1) break;
val1 = stream1.ReadInt32();
}
else
{
if (stream2.PeekChar() == -1) break;
val2 = stream2.ReadInt32();
}
}
}
stream1.Close();
stream2.Close();
output.Close();
Ha. Ha ha. Ha ha ha. And how, excuse me, will they compare? On the logical diagram of the disk? Well, you can read four bytes each (if we are talking about int), refer to it as an int, compare and write if they are the same, moving on, otherwise we leave the largest.
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question