Sorting by a set of keys computed from the element itself: how can the number of key computations and the memory consumption be minimized?
This is a real problem I ran into while writing an image filter. I spent all morning isolating it from the context of my applied task so that I could state it clearly.
The task is to implement a function that takes an array of N elements (their type does not matter) and an array of M functions, each of which can be applied to an element and returns an integer. The function must return an array of indices: for each position of the sorted order, the index of the corresponding element in the input array; the elements themselves are never actually moved. Sorting is by the tuple of keys, i.e. the integers returned by the functions. If two different elements have equal first keys (the values returned by the first function), their order is decided by the second keys (the values returned by the second function), and so on. There is no point in computing the same function twice for the same element, nor in computing keys that are not needed for the sort (in practice only the first function will inevitably be evaluated for every element). On top of that, memory is not infinite: its consumption should be kept as low as possible.
I keep concluding that in the worst case the memory consumption is O(N * M), which I really dislike. If anyone thinks smaller volumes are achievable, I would be very interested to hear the ideas. I also dislike the need for frequent dynamic memory allocations.
As for the algorithm itself, I have some vague ideas about building binary trees at each sorting level and then making a final horizontal pass over the whole "forest" to produce the output array of indices. But lookups and inserts in such trees cost O(log N), and I would like O(1), as with plain arrays. With plain arrays, however, I cannot save memory: they end up N * M in size even in the best case, while I am sure much smaller sizes would suffice in everything but the worst case.
Perhaps I'm overcomplicating things, and there is some simple and obvious algorithm.
Dear colleagues, share your thoughts on the subject.
P.S. The language in the tags is arbitrary.
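To make the requirements concrete, here is a minimal naive baseline (Python, since the tag language is arbitrary; all names are mine). It deliberately ignores the constraints: it computes every key up front, which costs O(N * M) time and memory, exactly what the question wants to avoid, but it pins down the expected input/output contract.

```python
def sort_indices_naive(elements, key_funcs):
    # Precompute the full key tuple for every element: O(N * M) work and memory.
    keys = [tuple(f(x) for f in key_funcs) for x in elements]
    # Sort positions of the input array by their key tuples; elements stay put.
    return sorted(range(len(elements)), key=keys.__getitem__)

# Example: sort points by x, then by y.
points = [(2, 5), (1, 9), (2, 3)]
order = sort_indices_naive(points, [lambda p: p[0], lambda p: p[1]])
print(order)  # -> [1, 2, 0]
```

Any better solution has to produce the same output array while computing fewer keys and holding less of them in memory at once.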
There is a suggestion to cache the values of the key functions in something like a HashMap<Pair<N,M>, key_value>. This should pay off if the functions take longer to compute than the cache bookkeeping does.
Sample code:
https://gist.github.com/AtomKrieg/6fb6d3d8ae80842e...
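A sketch of the caching idea above (names are mine, not from the gist): key values are computed lazily inside a comparator and memoized in a dict keyed by (element index, function index). Each key is computed at most once, and later functions are evaluated only for elements whose earlier keys actually tie during comparisons.

```python
import functools

def sort_indices_cached(elements, key_funcs):
    cache = {}  # (element_index, function_index) -> key value

    def key(i, j):
        # Compute key_funcs[j](elements[i]) at most once.
        if (i, j) not in cache:
            cache[(i, j)] = key_funcs[j](elements[i])
        return cache[(i, j)]

    def cmp(a, b):
        # Compare element indices a and b key by key; stop at the first difference.
        for j in range(len(key_funcs)):
            ka, kb = key(a, j), key(b, j)
            if ka != kb:
                return -1 if ka < kb else 1
        return 0

    return sorted(range(len(elements)), key=functools.cmp_to_key(cmp))

points = [(2, 5), (1, 9), (2, 3)]
print(sort_indices_cached(points, [lambda p: p[0], lambda p: p[1]]))  # -> [1, 2, 0]
```

The cache is still O(N * M) in the worst case (all keys tie until the last level), but in typical cases only a fraction of the keys is ever computed or stored.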
I think the corresponding algorithm can be sketched in pseudocode like this. Create an array of N elements of type Node and initialize it:
class Node {
    int group;
    int key;
    int index;
}

Node[N] nodes;
for (int i = 0; i < N; i++) {
    nodes[i].group = 0;
    nodes[i].index = i;
}
for (int j = 0; j < M; j++) {
    // for every group that contains more than one element:
    //   compute the j-th key for each element of the group
    //   sort the elements of the group in place by that key
    //   split the group into subgroups with equal key values,
    //   assigning each subgroup a unique number
    // if every group contains exactly one element, exit the loop early
}
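A runnable rendering of the pseudocode above (Python instead of the Java-like sketch; names are mine). Groups are tracked as half-open ranges over the index array, so beyond the output array the extra memory is O(N) plus the keys of the current level only: each level's keys are discarded before the next level starts, and level j's keys are computed only for elements still inside a multi-element group.

```python
def sort_indices_grouped(elements, key_funcs):
    order = list(range(len(elements)))   # the evolving array of indices
    groups = [(0, len(elements))]        # half-open [start, end) runs still needing refinement
    for f in key_funcs:
        if not groups:
            break                        # early exit: every group is a singleton
        next_groups = []
        for start, end in groups:
            seg = order[start:end]
            # One key evaluation per element of this group, this level only.
            keys = {i: f(elements[i]) for i in seg}
            seg.sort(key=keys.__getitem__)   # stable in-place sort of the group
            order[start:end] = seg
            # Split the sorted run into subgroups with equal key values;
            # only subgroups of size > 1 need further refinement.
            run_start = start
            for pos in range(start + 1, end + 1):
                if pos == end or keys[order[pos]] != keys[order[pos - 1]]:
                    if pos - run_start > 1:
                        next_groups.append((run_start, pos))
                    run_start = pos
        groups = next_groups             # this level's keys are now garbage
    return order

# Example: the third function is evaluated only for the two elements
# whose first two keys tie.
items = [(2, 5, 1), (1, 9, 0), (2, 3, 7), (2, 5, 0)]
funcs = [lambda p: p[0], lambda p: p[1], lambda p: p[2]]
print(sort_indices_grouped(items, funcs))  # -> [1, 2, 3, 0]
```

The unique subgroup numbers from the pseudocode are replaced here by the (start, end) ranges themselves, which serve the same purpose without an extra per-node field.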