How to determine the similarity (rewriting, uniqueness) of 2 texts in Go (Golang)?
Greetings, everyone!
Gentlemen, I need to determine how similar two (or more) texts are to each other (rewrite detection, uniqueness). If anyone has dealt with a similar task, please share tips and links to libraries.
P.S. Thanks in advance!
The task turned out to be fairly non-trivial, and there are quite a few ways to solve it. For anyone interested, a good place to start digging is: https://4gophers.ru/articles/semanticheski-analiz-...
For Python there is difflib in the standard library. The code below has not been tested, but it should work.
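from difflib import SequenceMatcher

# Read both files into strings (the original called .read() on the file names)
with open("text_1.txt", encoding="utf-8") as f1, open("text_2.txt", encoding="utf-8") as f2:
    text_1 = f1.read()
    text_2 = f2.read()

# The first argument is the isjunk function: spaces are treated as junk
s = SequenceMatcher(lambda x: x == " ", text_1, text_2)
print(round(s.ratio(), 3))  # a number from 0 to 1: 0 = nothing in common, 1 = identical text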
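Since the question is about Go, here is a rough sketch of one common approach using only the standard library: Jaccard similarity over word shingles (overlapping runs of k words). This is not a port of difflib, which uses a different matching algorithm. The names shingles and Similarity and the parameter k are made up for this example, and splitting on whitespace with lowercasing is a simplifying assumption; real rewrite detection usually also needs punctuation stripping and stemming.

```go
package main

import "strings"

// shingles returns the set of k-word shingles (overlapping word n-grams)
// found in text. Words are lowercased and split on whitespace only.
// Hypothetical helper for illustration.
func shingles(text string, k int) map[string]struct{} {
	if k < 1 {
		k = 1
	}
	words := strings.Fields(strings.ToLower(text))
	set := make(map[string]struct{})
	if len(words) < k {
		// Text shorter than one shingle: use the whole text as a single shingle.
		if len(words) > 0 {
			set[strings.Join(words, " ")] = struct{}{}
		}
		return set
	}
	for i := 0; i+k <= len(words); i++ {
		set[strings.Join(words[i:i+k], " ")] = struct{}{}
	}
	return set
}

// Similarity returns the Jaccard similarity of the two shingle sets:
// |A ∩ B| / |A ∪ B|, a number from 0 (no shared shingles) to 1 (same set).
func Similarity(a, b string, k int) float64 {
	sa, sb := shingles(a, k), shingles(b, k)
	if len(sa) == 0 && len(sb) == 0 {
		return 1 // assumption: two empty texts count as identical
	}
	inter := 0
	for s := range sa {
		if _, ok := sb[s]; ok {
			inter++
		}
	}
	union := len(sa) + len(sb) - inter
	return float64(inter) / float64(union)
}
```

Call Similarity(text1, text2, 3) on the contents of the two files (read with os.ReadFile, for example). A larger k makes the score stricter about word order; k = 1 just compares vocabularies. For more than two texts, compare each pair.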