How to find the set of rows and columns in the data that maximizes the sum of the cells?

Sergey Sokolov, 2017-02-01 21:20:48

Algorithms

The data is a matrix of m rows by n columns of coefficients.
I need to find a set of rows and a set of columns such that the sum of the coefficients at their intersections is maximal.
The answer is the indices of the selected rows and columns.
Please suggest an algorithm, or at least the name of the problem so I know where to dig.
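To make the statement concrete, here is a minimal brute-force sketch, only usable for small m. The function name `best_submatrix` and the sample data are made up for illustration. It relies on one observation: once the rows are fixed, the best columns are exactly those whose sum over the chosen rows is positive, so only the row subsets need to be enumerated.

```python
from itertools import combinations

def best_submatrix(data):
    """Return (best_sum, rows, cols) by enumerating every non-empty row subset.

    For a fixed row subset the optimal columns are those with a positive
    column sum over these rows. Runs in O(2^m * m * n), so small m only.
    """
    m, n = len(data), len(data[0])
    best = (float("-inf"), (), ())  # stays like this if no cell sum is positive
    for size in range(1, m + 1):
        for rows in combinations(range(m), size):
            col_sums = [sum(data[r][c] for r in rows) for c in range(n)]
            cols = tuple(c for c in range(n) if col_sums[c] > 0)
            total = sum(col_sums[c] for c in cols)
            if cols and total > best[0]:
                best = (total, rows, cols)
    return best

# Illustrative data, not taken from the real task.
sample = [
    [1, -5, 2],
    [2, -1, 3],
    [-4, 6, -2],
]
print(best_submatrix(sample))  # (8, (0, 1), (0, 2))
```

Here rows 0 and 1 with columns 0 and 2 give 1 + 2 + 2 + 3 = 8. This only pins down what is being asked; it is not meant as an efficient algorithm.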
Simplified case
The coefficients take only the values 0 or 1.
Find a set of rows and columns whose intersection contains only ones.
Suppose the number of columns is k = 2 times more important than the number of rows, i.e. 6 columns and 4 rows (6*2 + 4 = 16) is better than 4 columns and 6 rows (4*2 + 6 = 14). Example: here the matrix describes a graph of bidirectional links between objects 0..5, so it is symmetric about the main diagonal:

...0 1 2 3 4 5
0 [1,1,1,0,0,0]
1 [1,1,0,1,1,0]
2 [1,0,1,0,0,0]
3 [0,1,0,1,1,0]
4 [0,1,0,1,1,0]
5 [0,0,0,0,0,1]

Here there is a clique with columns == rows [0,1], and a larger one with [1,3,4]; the latter is the one I am looking for.
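The example can be checked with a small exhaustive search. This is a sketch under the assumption that the selection is symmetric (the same indices are used for rows and columns, as in the example), which makes it the maximum clique problem on the graph. The function name `largest_all_ones_block` is hypothetical. With rows == columns the score k * columns + rows grows with the size of the set, so the largest set is also the best one for any positive k.

```python
from itertools import combinations

# The 6x6 matrix from the example above.
ADJ = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 1],
]

def largest_all_ones_block(adj):
    """Return the largest index list S with adj[i][j] == 1 for all i, j in S.

    Tries candidate sets from largest to smallest, so the first hit is a
    maximum one. Exponential in the matrix size; fine for tiny examples only.
    """
    n = len(adj)
    for size in range(n, 0, -1):
        for idx in combinations(range(n), size):
            if all(adj[i][j] == 1 for i in idx for j in idx):
                return list(idx)
    return []

print(largest_all_ones_block(ADJ))  # [1, 3, 4]
```

If rows and columns may be chosen independently, the problem is no longer a clique search, and for this matrix the choice of row 1 with columns 0, 1, 3, 4 ties with the clique at k = 2 (4*2 + 1 = 3*2 + 3 = 9).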
The real case is more complicated.
A cell in each column can take only one of two values, one positive and one negative: a "reward" for the presence of a feature, or a "penalty" for its absence. They are asymmetric about zero: the reward is small and the penalty is large. Both values are the same within a column but differ between columns. The coefficient k will also be varied to obtain different results.
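A minimal sketch of how such a coefficient matrix could be built and how a candidate selection could be scored. The names `build_coefficients` and `selection_sum` and all numbers are illustrative assumptions, not real data. The coefficient k is left out, because how it enters the objective in the real case is not specified here.

```python
def build_coefficients(present, reward, penalty):
    """Build the coefficient matrix from presence flags.

    present[i][j] is True if row i has feature j; reward[j] > 0 and
    penalty[j] < 0 are the two values allowed in column j.
    """
    return [
        [reward[j] if has_feature else penalty[j]
         for j, has_feature in enumerate(row)]
        for row in present
    ]

def selection_sum(coeffs, rows, cols):
    """Sum of the cells at the intersections of the chosen rows and columns."""
    return sum(coeffs[i][j] for i in rows for j in cols)

# Illustrative values: small rewards, large penalties, different per column.
present = [
    [True, False, True],
    [True, True, False],
]
reward = [1.0, 0.5, 2.0]
penalty = [-10.0, -4.0, -8.0]

coeffs = build_coefficients(present, reward, penalty)
print(coeffs)                                # [[1.0, -4.0, 2.0], [1.0, 0.5, -8.0]]
print(selection_sum(coeffs, [0], [0, 2]))    # 3.0
print(selection_sum(coeffs, [0, 1], [0]))    # 2.0
```

Because a single penalty outweighs several rewards, any selection that includes even one "absent" cell is heavily punished, so the real case behaves close to the all-ones case above.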
Does this look like some standard machine-learning problem that could be prototyped in Matlab/Octave or Python?