Python
mrgloom, 2014-05-15 13:06:53

How do you perform operations on large matrices?

A fairly general question.
How do people usually work with large matrices?
By "large" I mean matrices that do not fit in RAM.
How should such matrices be stored? There is, for example, the HDF5 format (and PyTables, a more advanced option for Python).
Ideally, I would like to be able to append rows to the matrix (something like resize / append).
I know that Python and MATLAB each have their own memory-mapped file facilities.
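To make the append requirement concrete, here is a minimal sketch of one possible layout, assuming NumPy is available: rows are appended to a raw row-major binary file, and the file is then opened with `np.memmap`. The helper names `append_rows` and `open_matrix`, the fixed width `N_COLS`, and the file name are hypothetical, chosen only for illustration.

```python
import os
import tempfile

import numpy as np

N_COLS = 4            # assumed fixed row width (hypothetical)
DTYPE = np.float64    # assumed element type (hypothetical)


def append_rows(path, rows):
    """Append a 2-D block of rows to a raw row-major binary file."""
    block = np.ascontiguousarray(rows, dtype=DTYPE)
    if block.ndim != 2 or block.shape[1] != N_COLS:
        raise ValueError("expected an array of shape (k, %d)" % N_COLS)
    with open(path, "ab") as f:
        f.write(block.tobytes())


def open_matrix(path):
    """Map the file read-only; the row count is inferred from the file size."""
    row_bytes = N_COLS * np.dtype(DTYPE).itemsize
    n_rows = os.path.getsize(path) // row_bytes
    return np.memmap(path, dtype=DTYPE, mode="r", shape=(n_rows, N_COLS))


path = os.path.join(tempfile.mkdtemp(), "matrix.bin")
append_rows(path, np.arange(8).reshape(2, 4))       # rows 0..1
append_rows(path, np.arange(8, 20).reshape(3, 4))   # rows 2..4

m = open_matrix(path)
print(m.shape)   # (5, 4)
print(m[3])      # only the pages backing this row need to be read
```

The raw file stores no shape or dtype, so both have to be known by the reader. A resizable HDF5 dataset would keep that metadata in the file itself, at the cost of an extra dependency.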
R also has its own packages for working with big data.
Clearly, one could most likely pick a specialized out-of-core algorithm for each specific case, but I would like the matrix to be stored on disk with transparent access, as if it were a regular in-memory matrix. This is analogous to the cache <-> RAM relationship, applied to RAM <-> HDD.
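As a rough illustration of that kind of transparent access, the sketch below assumes NumPy and uses `np.memmap`, which can be indexed like an ordinary array while the operating system pages data in from disk. The function `blocked_matmul`, the block size, and the matrix sizes are hypothetical; the point is only that the on-disk matrix is read one block of rows at a time.

```python
import os
import tempfile

import numpy as np


def blocked_matmul(a, b, block_rows=1024):
    """Compute a @ b while reading `a` one block of rows at a time."""
    out = np.empty((a.shape[0], b.shape[1]), dtype=np.result_type(a, b))
    for start in range(0, a.shape[0], block_rows):
        stop = min(start + block_rows, a.shape[0])
        # np.asarray copies just this block of rows into memory.
        out[start:stop] = np.asarray(a[start:stop]) @ b
    return out


# Create a disk-backed matrix; small here, but the same code applies
# to a file that is larger than RAM.
path = os.path.join(tempfile.mkdtemp(), "a.dat")
a = np.memmap(path, dtype=np.float64, mode="w+", shape=(5000, 8))
rng = np.random.default_rng(0)
a[:] = rng.standard_normal(a.shape)
a.flush()

b = rng.standard_normal((8, 3))   # small matrix that fits in memory
result = blocked_matmul(a, b, block_rows=512)
print(result.shape)
```

Here the result is held in memory because it is small. If the output were also too large, it could be another `np.memmap` opened in `"w+"` mode.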
There is also Hadoop (MapReduce), but that belongs to a somewhat different area.

2 answers
alec_kalinin, 2015-02-14

An interesting approach to this problem in Python is described here:
matthewrocklin.com/blog/work/2015/01/14/Towards-OO...

lightcaster, 2014-05-17

Maybe you are framing the problem incorrectly? Usually, fitting the matrix in memory is not a problem. If it still is, minibatch methods work, as does something like online learning, where data is fed into the model sequentially in small pieces.
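A minimal sketch of the minibatch idea, assuming NumPy and using plain minibatch gradient descent on a least-squares problem as a stand-in for whatever model is actually needed. The names `iter_minibatches` and `fit_linear_sgd`, the learning rate, and the synthetic data are all hypothetical. The arrays `x` and `y` could equally be `np.memmap` objects, since only one slice of rows is touched at a time.

```python
import numpy as np


def iter_minibatches(x, y, batch_size):
    """Yield consecutive (features, targets) slices of at most batch_size rows."""
    for start in range(0, x.shape[0], batch_size):
        stop = start + batch_size
        yield np.asarray(x[start:stop]), np.asarray(y[start:stop])


def fit_linear_sgd(x, y, batch_size=100, epochs=20, lr=0.1):
    """Fit weights w for y ~ x @ w, updating after every minibatch."""
    w = np.zeros(x.shape[1])
    for _ in range(epochs):
        for xb, yb in iter_minibatches(x, y, batch_size):
            # Gradient of the mean squared error on this batch (up to a factor of 2).
            grad = xb.T @ (xb @ w - yb) / len(xb)
            w -= lr * grad
    return w


# Synthetic, noise-free data standing in for a matrix read from disk.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])
x = rng.standard_normal((2000, 3))
y = x @ true_w

w = fit_linear_sgd(x, y)
print(w)
```

Memory use is bounded by `batch_size` rows rather than by the size of the full matrix, which is what makes this approach usable when the data does not fit in RAM.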
