Is there a way in Python to protect electronic documents against leaks?
Greetings! Suppose there is a DOC or PDF file. The task is to generate an image of it for each user, with some characters slightly distorted or shifted, so that if an image of the document later surfaces somewhere, those changes make it possible to determine which user leaked it.
I see this solution: we give each user a unique token. Then we derive a set of pseudo-random numbers from the token, used as a salt. We take each document and slightly shift / crop / distort some characters according to those numbers. We save the resulting image and give it to the user.
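A minimal sketch of the "token to random numbers" step, assuming the token is hashed with SHA-256 and the digest seeds a pseudo-random generator. The function name `shifts_for_token` and the parameters `n_glyphs` and `max_shift` are made up for illustration; actually rendering the shifted characters into an image would need an imaging library and is not shown.

```python
import hashlib
import random


def shifts_for_token(token, n_glyphs, max_shift=1):
    """Derive a repeatable list of (dx, dy) pixel shifts from a user token.

    The same token always yields the same shifts, so nothing except the
    token itself has to be stored per user.
    """
    # Hash the token so that similar tokens still give unrelated seeds.
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return [
        (rng.randint(-max_shift, max_shift), rng.randint(-max_shift, max_shift))
        for _ in range(n_glyphs)
    ]


if __name__ == "__main__":
    # Hypothetical token; the first few glyph shifts for this user.
    print(shifts_for_token("user-42-token", 5))
```

Keeping the shifts small (one pixel here) is a guess at what stays invisible to the reader; the right magnitude depends on the rendering resolution.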
To determine whose copy a leaked document is, we simply compare parts of the document using a sliding window, with OpenCV as far as I remember.
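The attribution step could look roughly like this. The sketch assumes the per-character shifts have already been measured from the leaked image (for example by sliding-window template matching, which is not shown) and that each user's expected shifts can be regenerated from their token. All names here (`shifts_for_token`, `identify_leaker`) are hypothetical.

```python
import hashlib
import random


def shifts_for_token(token, n_glyphs, max_shift=1):
    """Regenerate the (dx, dy) shifts that were applied for this token."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return [
        (rng.randint(-max_shift, max_shift), rng.randint(-max_shift, max_shift))
        for _ in range(n_glyphs)
    ]


def identify_leaker(observed, tokens):
    """Return (token, score) for the token whose shifts best match `observed`.

    `score` is the fraction of glyphs whose measured shift equals the
    expected one, so a noisy or partly cropped leak can still be attributed.
    """
    if not observed:
        raise ValueError("no measured shifts to compare")
    best_token, best_score = None, -1.0
    for token in tokens:
        expected = shifts_for_token(token, len(observed))
        matches = sum(1 for e, o in zip(expected, observed) if e == o)
        score = matches / len(observed)
        if score > best_score:
            best_token, best_score = token, score
    return best_token, best_score
```

Scoring by the fraction of matching shifts, rather than requiring an exact match, is a design choice that tolerates measurement errors; a score close to that of the runner-up would suggest the leak cannot be attributed reliably.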
It seems there are no pitfalls, and I am surely not the first to come up with this, so maybe an implementation of something similar already exists?
UPD: Links to other programs/resources with this feature are also welcome)
The idea is sound, and something similar has come up before. Briefly: in a document you can replace individual characters with look-alikes (Cyrillic "с" => Latin "c"), although, ugh, this degrades search in the document, or play with spaces (insert a second space between words). Of course, if the user suspects such DRM, stripping it out of a DOC is a piece of cake; out of a PDF it is harder.
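The space-based variant can be sketched as follows, assuming each gap between words carries one bit of a user identifier (single space = 0, double space = 1). The helper names `embed_bits` and `extract_bits` are invented for this example, and the input text is assumed to be a single line.

```python
import re


def embed_bits(text, bits):
    """Hide `bits` in the gaps between words: 1 = double space, 0 = single."""
    words = text.split()  # also normalizes any existing runs of whitespace
    if len(bits) > max(len(words) - 1, 0):
        raise ValueError("text has too few word gaps for this many bits")
    if not words:
        return ""
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        sep = "  " if i < len(bits) and bits[i] else " "
        out.append(sep + word)
    return "".join(out)


def extract_bits(marked, n_bits):
    """Read back the first `n_bits` bits from the word gaps of `marked`."""
    gaps = re.findall(r" +", marked)
    return [1 if len(gap) > 1 else 0 for gap in gaps[:n_bits]]


if __name__ == "__main__":
    marked = embed_bits("the quick brown fox jumps over the lazy dog", [1, 0, 1, 1])
    print(repr(marked))
    print(extract_bits(marked, 4))
```

As the answer notes, this mark is easy to strip: collapsing repeated spaces removes it completely.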
To add to the previous answers: yes, if the document is run through OCR, the Cyrillic "с" => Latin "c" substitution will not survive, and extra spaces may not either (let alone small shifts and distortions of characters). But intentional spelling and punctuation errors can work, as long as you do not overdo it: numerous errors are conspicuous, while many readers will not notice a single error on a page.