How to ensure redundant integrity?

E

EverOne2017-09-10 19:03:09

Mathematics

EverOne, 2017-09-10 19:03:09

We are trying with colleagues to find a solution to a logical problem, to be honest, there is no certainty that it exists.
So the task:
There are 1000 prisoners, and there are numbers from 1 to 1000, which are assigned their own value. For example:

1 = 0f88Hs
2 = UdD16j
3 = NkP4V4
4 = bRKitt
5 = 3rLqBA
6 = 5gH6LE
7 = 0wrICY
8 = 58CEoM
9 = 1oCn43
...
1000 = Id2401

Each prisoner knows his pair [ number = value ] and the pair of the previous one, at least. But he can also remember the N-th number of random pairs (or non-random ones, by agreement).
In the morning guards will come and take the prisoners away - a random number (up to 75%, that is, there may be 250 people, or maybe 500) random prisoners. For example, to be shot. Number and order unknown.
So, the question is: what is the minimum number of pairs and what algorithm should be used by each prisoner to remember for guaranteed recovery of the sequence of values? How will this algorithm change if 90% of the prisoners are taken away or only 25%? How to evenly redistribute the load if another 1000 are added to the existing 1000, and the number remains the same, just two prisoners will remember the same number?

Reply

Answer the question

In order to leave comments, you need to log in

4 answer(s)

C

chupasaurus, 2017-09-11
@chupasaurus

I will add Deerenaros : The optimal solution is Folded Reed—Solomon Code , which is also suspiciously similar to the condition of the problem. Using this algorithm, each prisoner needs to remember in addition to his O(1/ε) parts of the code, where ε is the ratio of the remaining prisoners to the total, 1/ε rounded up for the necessary redundancy.
How it works - from the source pairs, a Reed-Solomon code of 1000 * (1 + O (1 / ε)) letters is compiled (the size of a letter in bits based on the size of the data assigned to numbers, in the problem it is 6 bits), each concluded in addition to the original pairs with their own number remembers parts of the code in place1000εn ... 1000ε(n+1)-1, n - your number. The minimum sample in the form of prisoners with numbers 1..1000ε will give 1000ε of its pairs, 1000 pieces of code and a pair of 1000, which is more than the minimum required 1000ε+1.
In the original version: ε=¼, O(1/ε)=4, code size 1000*(1+4)=5000 letters, stability of the FRS code is up to n*(1-ε)-1=3999loss, while we lose only 3749. For ε=⅒ they you have to memorize 10 pieces of code, for ε=¾ - one (as well as for ∀ε>1). If the number of prisoners is increased, then let the prisoners with the same numbers memorize different parts of the code: for example, one as already determined, the second with a shift of 10 from their number; in this case, the code can be equally distributed among all, and the number of parts for memorization will be reduced by half.

D

Deerenaros, 2017-09-10
@Deerenaros

Um, so redundant codes for you. Reed-Solomon, for example. Or cascading, they are more flexible.

S

Sergey Sokolov, 2017-09-10
@sergiks

Instead of “prisoners,” it’s easier for me to imagine hard drives, flash drives, or abstract data containers.
To check for stability, we will consider the worst case of intentional removal by evil "peelers" of a combination of containers that brings guaranteed integrity damage.
Each store "own" value + 1 value from the "previous". In fact, it doesn't matter which one, because their sequence does not play any more. In this situation, data is guaranteed to be saved only when 1 single container is deleted. Because By the second, we will certainly choose the one that stores the data of the first in excess.
Delete N containers. The goal is to cleanly delete the data of at least one. Those. we select all those who store redundantly the data of this "chosen one".
The only way to protect yourself is to store redundantly in N+1 containers. Rather, the number of copies of each data unit should be 1 more than potentially deleted. According to the algorithm, without complicating it, you can store in the next N.

X

xmoonlight, 2017-09-11
@xmoonlight

Arrange the entire sequence (1-1000) in a ring.
Let's say we want a margin of safety: 2.
Then, we divide this ring in half and each X remembers the diametrical value:
1 => 1, 500
2 => 2, 501
, etc.
Safety Factor 3:
1 => 1, 333, 666
2 => 2, 334, 667
etc.
To reduce the amount of stored information, you can resort to entropy coding .