Answer the question
In order to leave comments, you need to log in
Search for the most suitable sets from a sequence (too many combinations)
There is a set of fixed length bit sequences (L == 1024). The number of sequences is ~43000.
How to find among them a combination of N (N == 10) such that the number of included bits in the bitwise logical addition of the result is maximum?
For example (L = 4, N = 2)
The answer will be:
The task is complicated by a huge amount of initial data. The number of possible combinations makes me sad.
Is it possible to solve this problem in an acceptable time, or rely on chance and run a random algorithm for a couple of days? upd
Either I'm already stupid, or I don't understand the proposed solutions. Searching for the maximum number of bits will not return a result. I illustrate:
L = 4, N = 2, for the set
1110 - 3
1100 - 2
0100
0010
0110
1000
1100
0110
1100
---- проверяем, какие биты у нас включены
1110
0110 - 2
0001 - 1
if we count the number of bits and sort (take top N), we get
1110
1100
and
we
need
1110
0001
upd2
11100 - 3
11000 - 2
01110 - 2
00010 - 1
00001 - 1
0) m = 11111
1) get the best sequence for m (11111)
result = [11100]
now m = 00011
2) get the best sequence for m(00011)
result = [11100, 01110]
now m = 00001
3) get the best sequence for m(00001)
result = [11100, 01110, 00001]
now m = 00000, finished
getting the answer - [11100, 01110, 00001]
but this is not the only solution (at step 2, 01110, 00010, 00001 are suitable)
if the amount of initial data is large, then there will be many "best" results. and, of course, not the fact that there is generally a combination that will cover all the bits (I don’t need this). in nature, I will get a bunch of results (if I choose random among the “best” options). from these “best” ones, in the end, I will choose the most suitable one for me (I will sort by the number of covered bits).
Answer the question
In order to leave comments, you need to log in
In upd2, the correct algorithm, you can choose from the “best” the first one that comes across, this absolutely does not affect the resulting result. (If I understand correctly, are you now considering all options with different "best" choices at each step?)
What's the problem with counting the bits in a 1024 bit sequence? Where do you store them? I think in the end this task will be reduced to 1024 shifts. Load them into 32 longs and iterate through the array.
long[][] myArray = new long[43000][32]; int[] bitsCount = new int[43000]; //load our sequence into 32 longs. 32*4*8 = 1024 bits <here is some code> //determine the number of bits set for(int i = 0;i<43000;i++) //loop through all sequences for (int j = 0;j<32;j++) //loop through all longs for(int k = 0;k<32;k++) //loop over all long bits if((myArray[i][j] >> k) & 0x1) bitCount[i]++; //next, sort the bitsCount array and take the N top ones. Or we optimize the sorting so that it does not completely sort, but only looks for the top N //??? //PROFIT!
You only need to read a 5 megabyte file, reading in chunks of 10 kb.
You need to read from each position (from 1 bit to 1024, 2 to 1025, etc.), but if some piece gives a really bad result, you can skip K bits at once, where K is the difference between the result and the top 10 result at the moment (to don't miss anything).
In my opinion, it will turn out pretty quickly, although not at all algorithmically.
I have this vague feeling that this is an NP-complete task. But I can't believe in algebra :(
And why are you discarding the most obvious way:
Take any combination, then try each remaining sequence in turn, whether it can improve the combination (the main parameter is an increase in the coverage area, the side parameter is the number of single bits).
Complexity is linear
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question