Answer the question
In order to leave comments, you need to log in
Algorithm for finding numbers without pairs from a large data stream?
Good afternoon!
Today the KPI-Open Olympiad was held and there was an interesting task:
Limits: 8 sec. 8Mb
Condition
There are N natural numbers (N is paired from 2 to 10,000,000) modulo no more than 1,000,000,000. Almost all of these numbers have a pair. But two numbers - without a pair. It is necessary to find those same two numbers without a pair.
I pay attention to restrictions.
Example
Input
6
4 8 4 7 9 9
Answer
7 8
Example2
Input
8
7 7 7 5 5 5 5 6
Answer
6 7
Answer the question
In order to leave comments, you need to log in
uint32_t xor_all = 0;
uint32_t xor_bit[32] = {0};
....
for (i = 0; i < n; ++i) {
xor_all ^= in;
for (j = 0; j < 32; ++j) {
if (in & (1 << j))
xor_bit[j] ^= in;
}
}
for (j = 0; j < 32; ++j) {
if (xor_all & (1 << j)) {
out1 = xor_bit[j];
out2 = xor_all ^ xor_bit[j];
break;
}
}
I would use a set.
The algorithm is as follows: for each number, we check whether it is in the set. If not, add, otherwise remove. This check goes for O(log n) for std::set in C++, or even for a constant for hash containers. (I don't know which language can be used according to the problem statement).
The only problem is if the numbers and their pairs are not located pseudo-randomly, but for example, first all the unique ones, and then all their pairs. In this case, when passing the first half of the array, the size of the set will only increase and we can fly with a memory limit.
+1, posted the same thing. Depending on the compiler, C++ can use either C++11 containers (unordered_set) or from the tr1 namespace on older compilers, if present. In fact, when solving it head-on, you will have to store at least half of the values, or store it somehow more cunningly.
I thought a little: the memory problem can be circumvented if it is allowed to bypass the numbers not sequentially, but in random order.
You can also use the "divide and conquer" principle, i.e. build, as described above, sets of elements without a pair for small pieces of the original array, and then look for a symmetric difference (union without intersection) of these sets.
We will need to count this array 2 or 3 times. First, we will store these numbers modulo 10^5. Let's get a bool-array of 10^5 elements - how many times each module was encountered. Consider 2 options
1) The required numbers modulo 10^5 are different. Then we will have 2 non-zero cells. We know the modules of the desired numbers by 10^5. The task was reduced to the following - to find from the array of numbers one number that does not have a pair. Solved by xor. We read the array again and calculate 2 xor for the found modules.
2) The desired numbers are modulo 10^5 (all cells are zero). Then they are not equal modulo 10^5+1.
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question