D
D
dom1n1k2016-05-20 13:54:49
JavaScript
dom1n1k, 2016-05-20 13:54:49

Which data structure to choose for a multidimensional sparse matrix?

Omitting some minor details, there is a large sparse 2- or 3-dimensional matrix. How big? Well, the range of indices (for each dimension) is up to several million. The total number of actually occupied elements (for the entire matrix) is thousands, tens of thousands.
On the one hand, it is required to be able to quickly access an element by index (a[i][j][k]). On the other hand, we need the ability to quickly bypass all non-empty elements and do something with them (relatively speaking, a.forEach(...)). Speed ​​is very important.
The matrix element is an object, so typed arrays are obviously excluded.
The question is in the data structure. At the moment I have the following ideas:
1. An array of arrays. The general principle is simplistically demonstrated by the code:

var a = [];

function set (i, j, k, value) {
  if (a[i] === undefined) a[i] = [];
  if (a[i][j] === undefined) a[i][j] = [];
  a[i][j][k] = value;
  return a[i][j][k];
}

function get (i, j, k) {
  return (a[i] === undefined || a[i][j] === undefined) ? undefined : a[i][j][k];
}

Bold minus: it is not clear how to search for and bypass busy elements? Organize a full cycle with stopizziot indexes and checks for undefined? It will search until the global flood.
2. Hash
var a = {};

function set (i, j, k, value) {
  var key = i + '-' + j + '-' + k;
  return (a[key] = value);
}

function get (i, j, k) {
  var key = i + '-' + j + '-' + k;
  return a[key];
}

Minus: Well, probably the lines are slow. Although, as I recently found out, V8 does not actually concatenate strings, but organizes an internal object with links to pieces.
3. A combination of points 1 and 2
That is, keep both data structures in parallel and in each situation refer to the one that is more convenient. Because the final elements are objects, arrays will only contain references. That is, the change of elements through the first structure will be visible through the second.
Minus: an overhead in memory and time to build.
4. Map
https://developer.mozilla.org/en/docs/Web/JavaScri...
Cons: firstly, incomplete cross-browser compatibility, and secondly, this structure does not provide anything particularly new in comparison with a self-written hash.
In Map, you can use objects as keys, but identification is not by content, but by the value of the link! That is, if I added something by the key { i: 1, j: 2, k:-3 }, then you can get it back only by passing the same object. If we form a new triple of indices, albeit numerically equal to the old one, undefined will be returned.
It turns out that you need to make the same compound strings or something else like keys. That is, no profit - well, except for syntactic sugar.
update
I missed a very important point - indexes can be negative too!
And since in JS array indices are only positive, this further reduces the attractiveness of options 1 and 3 (that is, the indices, of course, can be recalculated internally, leading to a non-negative form, but this is an extra hassle and calculations).

Answer the question

In order to leave comments, you need to log in

2 answer(s)
A
Adamos, 2016-05-20
@Adamos

> Array of arrays. Bold minus: it is not clear how to search for and bypass busy elements? Organize a full cycle with stopizziot indexes and checks for undefined? It will search until the global flood.
In fact, the pass
does not produce any undefined, only real-life keys and values ​​will be traversed. No map or hashes are explicitly required here - it's all implemented in the language itself.

#
#algooptimize #bottize, 2016-05-20
@user004

Without changes, you can create a matrix from nested sorted arrays, and they store an index and a reference to an array or value. The value can be an object or an index of an object in a separate list of all objects. I read something about asm js for a long time. Might be useful here. Given the changes, you can have a second matrix in which changes will occur. In the background, you can glue them. It seems there are some background workers. There are two matrices to consider when searching and traversing. Most likely there are ready-made implementations for your task in other languages.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question