Y
Y
Yaroslav2018-09-11 15:42:07
Algorithms
Yaroslav, 2018-09-11 15:42:07

How to check data integrity remotely (on an untrusted host)?

Imagine a p2p service where people trust each other to store data. (For example, my photo album is stored by 3 more people on the network, and I also store someone else's data for this. If my disk burns out, the data can be restored from their copies).
Recovery is very rare. Therefore, there is a risk that the remote node will "lie". Will accept data for storage and simply "save to /dev/null". Therefore, there is a need to somehow periodically check them. The easiest method to check is to periodically download and compare them, but this is very expensive and only works if you have a complete copy. (For example, a third-party service cannot do this automatically).
Is there any more efficient algorithm? For example, right away I see such a scheme - before returning, we create a thousand random strings (prefixes), and calculate the hash from the prefix + data, store them locally (this takes up little space). Then we periodically ask the remote side to count them too (we give it a substring) and compare. Without the complete file, it cannot correctly calculate the hash from prefix+data.
Or just request a random block of several bytes at the specified offset from the beginning of the archive. (And locally we store a number of such blocks for verification).
Are there any ready-made approaches for this? It seems to me that this should be some kind of typical cryptographic task, like a key exchange task or a hash sum calculation. Maybe even somewhere in practice something similar is used?

Answer the question

In order to leave comments, you need to log in

2 answer(s)
U
uvelichitel, 2018-09-11
@uvelichitel

Blockchain Uses Merkle Tree - Featured Article

M
Michael Galyuk, 2018-09-27
@robux

As a variation of requesting a random block by a random offset, you can request not the block itself, but its hash. Something like: "Hey dude, give md5 blocks of length 512 at offset 23121 of file so-and-so", and the node responds.
If you yourself do not have this piece, you can ask the second node to do the same and compare. If it converges, then calm down for a while, if it disperses, then either request the third node, or start chasing the first two nodes in their pieces until one "burns out".

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question