Arithmetic compression/encoding?
Hello!
I need to encode the message "информационный" ("informational") with arithmetic coding and then represent the result in binary form.
The method doesn't seem complicated, but I've run into a problem. All the examples I can find on the Internet use the interval [0; 1), and in all of them the message fits neatly into that interval. That doesn't seem to work for "информационный", since only three characters repeat in it: "н" (3 times), "и" (2) and "о" (2).
What should I do in this case?
What's the problem?
Determine the alphabet: (а, и, й, м, н, о, р, ф, ц, ы).
Count the symbols to get the probability distribution.
Symbol counts:
а | и | й | м | н | о | р | ф | ц | ы
1 | 2 | 1 | 1 | 3 | 2 | 1 | 1 | 1 | 1
Cumulative counts:
а | и | й | м | н | о | р | ф | ц | ы
1 | 3 | 4 | 5 | 8 | 10 | 11 | 12 | 13 | 14
Cumulative probabilities (cumulative count / 14, the upper bound of each symbol's subinterval):
а | и | й | м | н | о | р | ф | ц | ы
0.071 | 0.214 | 0.286 | 0.357 | 0.571 | 0.714 | 0.786 | 0.857 | 0.929 | 1
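The three tables above can be reproduced with a few lines of Python. This is only a sketch: the variable names are mine, and it assumes the alphabet is sorted alphabetically, as in the tables.

```python
from collections import Counter
from itertools import accumulate

message = "информационный"

counts = Counter(message)
alphabet = sorted(counts)                    # а, и, й, м, н, о, р, ф, ц, ы
freqs = [counts[ch] for ch in alphabet]      # symbol counts
cumulative = list(accumulate(freqs))         # cumulative counts
total = cumulative[-1]                       # 14 symbols in the message

# Upper bound of each symbol's subinterval of [0, 1), rounded for display only.
upper_bounds = [round(c / total, 3) for c in cumulative]

for row in (alphabet, freqs, cumulative, upper_bounds):
    print(" | ".join(str(x) for x in row))
```

Each symbol's subinterval runs from the previous symbol's upper bound (0 for "а") to its own upper bound.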
start - [0, 1)
и - [0.071, 0.214)
н - [0.122, 0.153)
ф - [0.1465, 0.1486)
...
й - [0.147980532221506, 0.147980532221545)
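The narrowing steps above can be sketched in Python with exact fractions, so that no precision is lost along the way. The function names `build_ranges` and `encode` are made up for this illustration, and the model is simply the symbol counts of the message itself.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate

def build_ranges(message):
    """Map each symbol to its (low, high) subinterval of [0, 1)."""
    counts = Counter(message)
    alphabet = sorted(counts)
    total = len(message)
    uppers = list(accumulate(counts[ch] for ch in alphabet))
    lowers = [0] + uppers[:-1]
    return {ch: (Fraction(lo, total), Fraction(hi, total))
            for ch, lo, hi in zip(alphabet, lowers, uppers)}

def encode(message, ranges):
    """Narrow [0, 1) once per symbol and return the final (low, high)."""
    low, high = Fraction(0), Fraction(1)
    for ch in message:
        width = high - low
        sym_low, sym_high = ranges[ch]
        low, high = low + width * sym_low, low + width * sym_high
    return low, high

message = "информационный"
ranges = build_ranges(message)
low, high = encode(message, ranges)
print(float(low), float(high))   # floats are for display only
print(high - low)                # exact width of the final interval
```

The width of the final interval is the product of the probabilities of all 14 symbols, which is why it ends up so small.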
The interval [0; 1) is a mathematical interval; it does not have to be stored in a float or a double. Any message will "squeeze" into it as long as the number is allowed to be long enough. Only the compression ratio will be poor.
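As for the binary form: one possible approach (my assumption, not the only convention) is to take the shortest binary fraction 0.b1b2...bk that lies inside the final interval and output its bits. A sketch with exact integers and fractions, where `final_interval` and `shortest_binary_fraction` are hypothetical helper names:

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate
from math import ceil

def final_interval(message):
    """Exact arithmetic-coding interval, using the message's own symbol counts."""
    counts = Counter(message)
    alphabet = sorted(counts)
    uppers = list(accumulate(counts[ch] for ch in alphabet))
    lowers = [0] + uppers[:-1]
    total = len(message)
    low, high = Fraction(0), Fraction(1)
    for ch in message:
        i = alphabet.index(ch)
        width = high - low
        low, high = (low + width * Fraction(lowers[i], total),
                     low + width * Fraction(uppers[i], total))
    return low, high

def shortest_binary_fraction(low, high):
    """Return the fewest bits b1..bk such that low <= 0.b1..bk (base 2) < high."""
    k = 1
    while True:
        n = ceil(low * 2**k)          # smallest multiple of 2**-k that is >= low
        if Fraction(n, 2**k) < high:
            return format(n, f"0{k}b")
        k += 1

low, high = final_interval("информационный")
bits = shortest_binary_fraction(low, high)
print(len(bits), bits)
```

To decode, the receiver would also need the frequency table and the message length (or an end-of-message symbol); those are not part of the bit string produced here.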
You could also compress not symbol by symbol but in 4-bit units, for example, or even bit by bit. The result might be better, or it might be worse.