Find median of two sorted arrays?
I ran into a curious problem yesterday. Two arrays are given, each already sorted in ascending order and each terabytes in size, so they cannot simply be merged into one array and re-sorted in memory. The task is to find the median: the element that sits in the middle once both arrays are combined into a single sorted sequence. For example, the arrays [1, 3, 5, 7] and [2, 6, 10] have a median of 5, because the combined sequence is [1, 2, 3, 5, 6, 7, 10]; note that this is not the arithmetic mean. I am looking for solutions in C++. It would be interesting to see your ideas, because so far I have not managed to finish my own.
______________________
I've already figured it out :) I didn't find any adequate C++ solutions, but I did see a couple of great ideas and simplified them as much as I could. I'll write the solution up in C++ and post it here soon. Thank you all so much!
What's the problem? The lengths of both arrays are known, so take half the sum of the lengths and advance through both arrays in parallel by that many elements, as in the merge step of merge sort.
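A JavaScript version, written off the top of my head:
const arr1 = [1, 3, 5, 7];
const arr2 = [2, 6, 10];
let p1 = arr1.length;
let p2 = arr2.length;
let n = p1 + p2;
// Empty input: make n negative so both steps below are skipped and med stays 0.
if (n === 0) {
    n--;
}
let med = 0;
// Point both cursors at the last (largest) element of each array.
p1--;
p2--;
// Walk down from the largest element. Each step consumes one element
// and lowers n by 2, so the loop stops in the middle of the merged data.
while (n > 0) {
    if (p2 < 0 || (p1 >= 0 && arr1[p1] > arr2[p2])) {
        med = arr1[p1--];
    } else {
        med = arr2[p2--];
    }
    n -= 2;
}
// n ends at exactly 0 only for an even total: average the two middle elements.
if (n === 0) {
    if (p2 < 0 || (p1 >= 0 && arr1[p1] > arr2[p2])) {
        med = med + arr1[p1];
    } else {
        med = med + arr2[p2];
    }
    med /= 2;
}
console.log(med);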
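Since the question asks for C++, here is a rough sketch of the same merge-style walk, this time moving forward from the smallest element. It is only an illustration under some assumptions: the function name `medianByMergeWalk` is made up for this example, and `std::vector` stands in for the real data, which at terabyte scale would have to be read sequentially from disk rather than held in memory. For an even total it averages the two middle elements, as the JavaScript version does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Median of two ascending-sorted sequences via a merge-style walk.
// Only two cursors and the last two values seen are kept.
// (Hypothetical helper; vectors stand in for data streamed from disk.)
double medianByMergeWalk(const std::vector<long long>& a,
                         const std::vector<long long>& b) {
    const std::size_t n = a.size() + b.size();
    assert(n > 0 && "median of an empty input is undefined");

    std::size_t i = 0, j = 0;
    long long prev = 0, cur = 0;
    // Consume elements in merged order up to 0-based index n / 2.
    for (std::size_t step = 0; step <= n / 2; ++step) {
        prev = cur;
        if (j >= b.size() || (i < a.size() && a[i] <= b[j])) {
            cur = a[i++];
        } else {
            cur = b[j++];
        }
    }
    // Odd total: the middle element. Even total: mean of the two middle ones.
    if (n % 2 == 1) {
        return static_cast<double>(cur);
    }
    return (static_cast<double>(prev) + static_cast<double>(cur)) / 2.0;
}
```

Calling `medianByMergeWalk({1, 3, 5, 7}, {2, 6, 10})` walks 1, 2, 3, 5 and returns 5. The walk touches about half of all elements, so it is linear in the input size but needs only constant memory.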
With this much data there are probably a great many duplicate values.
You could try collecting statistics first: for each distinct value, the number of times it occurs.
Sort the statistics by value and find the median from the cumulative counts.
The result may still be large, but it will no longer be terabytes.
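The statistics idea could look roughly like this in C++. This is a sketch, not a tuned implementation: the names `Histogram`, `addToHistogram`, `valueAtRank` and `medianFromHistogram` are invented for the example, `std::map` is just one convenient way to keep the counts sorted by value, and it only pays off if the number of distinct values is small enough to fit in memory.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// value -> number of occurrences, kept ordered by value.
using Histogram = std::map<long long, std::uint64_t>;

// Add one batch of values to the statistics (hypothetical helper).
void addToHistogram(Histogram& hist, const std::vector<long long>& values) {
    for (long long v : values) {
        ++hist[v];
    }
}

// Value at 0-based rank k in the combined data, found by
// accumulating counts in ascending order of value.
long long valueAtRank(const Histogram& hist, std::uint64_t k) {
    std::uint64_t seen = 0;
    for (const auto& entry : hist) {
        seen += entry.second;
        if (k < seen) {
            return entry.first;
        }
    }
    assert(false && "rank out of range");
    return 0;
}

double medianFromHistogram(const Histogram& hist) {
    std::uint64_t n = 0;
    for (const auto& entry : hist) {
        n += entry.second;
    }
    assert(n > 0 && "median of an empty input is undefined");
    const long long upper = valueAtRank(hist, n / 2);
    if (n % 2 == 1) {
        return static_cast<double>(upper);
    }
    const long long lower = valueAtRank(hist, n / 2 - 1);
    return (static_cast<double>(lower) + static_cast<double>(upper)) / 2.0;
}
```

Both arrays are fed through `addToHistogram` in chunks, after which `medianFromHistogram` needs only the table, not the original data.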
Alternatively, skip the statistics and simply count elements while moving along both arrays: the element on which you reach the median position is the median.
If all values have the same width (for example, 32-bit integers), the position of the median element is easy to compute from the total size of the arrays. And because the arrays are sorted, the number of occurrences of any particular value can be found quickly, for example with a binary search.
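One possible way to exploit the sortedness, sketched in C++ below, is to count with binary search instead of scanning. This goes a step beyond what the answer spells out, so treat it as an assumption: `countEqual` counts one specific value, and `kthByCounting` bisects on the value itself, asking how many elements in both arrays are less than or equal to a candidate. All names are made up for the example, the vectors stand in for fixed-width records on disk (where each probe would be a seek), and the value range is assumed to be narrow enough that `hi - lo` does not overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<long long>;

// Number of elements <= v in an ascending-sorted array, O(log n).
std::size_t countLessOrEqual(const Vec& sorted, long long v) {
    return static_cast<std::size_t>(
        std::upper_bound(sorted.begin(), sorted.end(), v) - sorted.begin());
}

// Number of occurrences of exactly v, O(log n).
std::size_t countEqual(const Vec& sorted, long long v) {
    const auto range = std::equal_range(sorted.begin(), sorted.end(), v);
    return static_cast<std::size_t>(range.second - range.first);
}

// Value at 0-based rank k of the merged data: the smallest v for which
// more than k elements are <= v. Assumes hi - lo fits in long long.
long long kthByCounting(const Vec& a, const Vec& b, std::size_t k) {
    assert(k < a.size() + b.size());
    long long lo = a.empty() ? b.front()
                 : b.empty() ? a.front() : std::min(a.front(), b.front());
    long long hi = a.empty() ? b.back()
                 : b.empty() ? a.back() : std::max(a.back(), b.back());
    while (lo < hi) {
        const long long mid = lo + (hi - lo) / 2;
        if (countLessOrEqual(a, mid) + countLessOrEqual(b, mid) > k) {
            hi = mid;      // enough elements at or below mid
        } else {
            lo = mid + 1;  // the answer must be larger
        }
    }
    return lo;
}

double medianByCounting(const Vec& a, const Vec& b) {
    const std::size_t n = a.size() + b.size();
    assert(n > 0 && "median of an empty input is undefined");
    const long long upper = kthByCounting(a, b, n / 2);
    if (n % 2 == 1) {
        return static_cast<double>(upper);
    }
    const long long lower = kthByCounting(a, b, n / 2 - 1);
    return (static_cast<double>(lower) + static_cast<double>(upper)) / 2.0;
}
```

The number of probes depends on the width of the value range and the logarithm of the array lengths rather than on the array lengths themselves, which is what makes this attractive for very large inputs.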