Algorithms
Anton Zhuchkov, 2021-12-15 00:02:19

Algorithm for matching two texts?

There are two texts of the same document, and I need to find the fragments that match or almost match. For example, one text has a header and comments and the other doesn't. I need to identify, preferably quickly, the fragments that are the same in both texts.

Finding fuzzy matches would be especially valuable. For example, one of the texts was produced by image recognition (OCR) and is rather garbled in places.

Please point me in the right direction. What algorithms can be applied, and what should I read?


1 answer(s)
Armenian Radio, 2021-12-15
@gbg

Start with diff, then docdiff. The latter diffs Word files pretty well.
I forgot the main thing: dissertation plagiarism detectors!
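
To make the "start with diff" idea concrete, here is a minimal sketch using Python's standard difflib module. It is only an illustration, not what diff or docdiff do internally: the function names `exact_blocks` and `fuzzy_pairs`, the 0.8 threshold, and the sample lines are all assumptions made up for this example. The first function finds runs of identical lines, the second pairs each line with its most similar counterpart so that OCR-style damage still matches.

```python
from difflib import SequenceMatcher


def exact_blocks(a_lines, b_lines):
    """Return (start_a, start_b, length) for each run of identical lines."""
    sm = SequenceMatcher(None, a_lines, b_lines, autojunk=False)
    # get_matching_blocks() ends with a zero-length sentinel; drop it.
    return [(m.a, m.b, m.size) for m in sm.get_matching_blocks() if m.size > 0]


def fuzzy_pairs(a_lines, b_lines, threshold=0.8):
    """Pair each line of a with its most similar line of b.

    A pair is kept only if the similarity ratio reaches the threshold
    (0.8 is an arbitrary illustrative value, tune it for your data).
    """
    pairs = []
    for i, a in enumerate(a_lines):
        best_j, best_score = None, 0.0
        for j, b in enumerate(b_lines):
            score = SequenceMatcher(None, a, b, autojunk=False).ratio()
            if score > best_score:
                best_j, best_score = j, score
        if best_j is not None and best_score >= threshold:
            pairs.append((i, best_j, round(best_score, 2)))
    return pairs


# Hypothetical sample data: a clean text and an OCR-like copy with a header.
clean = ["Chapter 1", "The quick brown fox", "jumps over the lazy dog"]
ocr = ["Page header", "The qu1ck brovvn fox", "jumps over the lazy dog"]

print(exact_blocks(clean, ocr))
print(fuzzy_pairs(clean, ocr))
```

Note that the fuzzy part compares every line with every other line, so it is quadratic and only practical for small inputs. For large documents you would probably need something like shingling or hashing to narrow down candidates first, which is roughly the territory plagiarism detectors work in.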
