Vitaly Stolyarov, 2017-05-02 13:25:09

An efficient algorithm for compressing an array of floats and ints?

I tried LZS:
- a random array of bytes compressed by almost 1.5 times; for example, 6000 bytes compressed into 4200
- an array of points in three-dimensional space (float32) with random positions compressed by about 1.7 times, while a uniform arrangement of points (a regular grid) compressed by more than 6 times
The question is: are there better compression algorithms for a float array, and are there any other hacks to reduce the size of an array without significant data loss?
So far I have this option in mind: float can be converted to half, and since precision drops as the magnitude grows, while many points sit close to one another, the loss can be compensated like this:
the position of each point, starting from the second, is stored as a half-float delta and reconstructed by adding that delta to the previous point's position, which has already been converted back to float
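A minimal sketch of that scheme in NumPy (the function names and the (n, 3) array layout are my assumptions, not from the question): the first point is kept in full float32, and every later point is stored as a float16 delta against the previously reconstructed point, so rounding error does not accumulate along the chain.

```python
import numpy as np

def delta_half_encode(points):
    # points: (n, 3) float32 array of vertex positions
    deltas = np.zeros_like(points, dtype=np.float16)
    prev = points[0].astype(np.float32)            # first point stays exact
    for i in range(1, len(points)):
        d = (points[i] - prev).astype(np.float16)  # quantize the delta to half
        deltas[i] = d
        prev = prev + d.astype(np.float32)         # mirror the decoder's rounding
    return points[0].copy(), deltas

def delta_half_decode(first, deltas):
    out = np.zeros((len(deltas), 3), dtype=np.float32)
    out[0] = first
    for i in range(1, len(deltas)):
        out[i] = out[i - 1] + deltas[i].astype(np.float32)
    return out
```

On a regular grid most deltas end up as identical bit patterns, which is exactly the kind of redundancy a dictionary coder like LZS then compresses very well.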
It is also worth adding that this data describes a 3D model (vertex and index buffers). I will look at existing solutions for 3D models, but I'm not entirely sure they will work here, since I need arrays like these:
vertex buffer - an array of 3*n float elements, where n is the number of vertices
index buffer - an array of 3*m int elements (all values less than n), where m is the number of triangles
texture buffer - an array of 6*n float elements
Another important point is that the buffers are small (a vertex buffer has no more than 30k float elements, most often around 10k, which is about 40 KB).
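For concreteness, here is that layout expressed in NumPy; n and m are hypothetical example sizes chosen only to match the magnitudes quoted above:

```python
import numpy as np

n = 3_000   # vertices  -> vertex buffer of 3*n = 9_000 floats
m = 5_000   # triangles -> index buffer of 3*m ints, every value < n

vertex_buffer  = np.zeros(3 * n, dtype=np.float32)  # x, y, z per vertex
index_buffer   = np.zeros(3 * m, dtype=np.int32)    # 3 vertex indices per triangle
texture_buffer = np.zeros(6 * n, dtype=np.float32)  # 6 floats per vertex

print(vertex_buffer.nbytes)  # 36_000 bytes: ~10k float32 elements is about 40 KB
```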


2 answers
Andrew, 2017-05-02
@OLS

Information is entropy. The more fully you build meta-information about the input stream into the compression algorithm (statistics, centers of mass, correlations between neighboring values, correlations within packets, etc.), the better the average compression ratio on large volumes will be. Can you tell us about your real data?
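As an illustration of that point (my example, not part of the answer): byte-level Shannon entropy gives a quick feel for how much statistical headroom a raw stream has, and how much structure a transform such as delta coding exposes.

```python
import numpy as np

def byte_entropy(data: bytes) -> float:
    # Shannon entropy in bits per byte; 8.0 means incompressible noise
    counts = np.bincount(np.frombuffer(data, dtype=np.uint8), minlength=256)
    p = counts[counts > 0] / len(data)
    return float(-(p * np.log2(p)).sum())

# Raw float32 grid coordinates look nearly random byte-wise, but their
# deltas collapse into a few repeated byte patterns:
grid = np.arange(10_000, dtype=np.float32) * 0.25
print(byte_entropy(grid.tobytes()))           # relatively high
print(byte_entropy(np.diff(grid).tobytes()))  # much lower: every delta is 0.25
```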

Roman Mirilaczvili, 2017-05-02
@2ord

I recommend listening to podcast #44 with the author of the Akumuli project (a database for storing time series, a TSDB). He talks about various interesting aspects of storing data, including storing arrays of numbers.
