Where does text.ru get data for checking for plagiarism?


API

sokolnikov, 2017-07-24 12:49:06

Hello. Does anyone have a guess where text.ru gets its data for plagiarism checks? They seem to have a faster data source than the SERPs.
For example, I added unique text content to one of the sites, and within a minute text.ru's algorithm had already detected and analyzed it. Yet it takes more than a week for that content to appear in the Yandex and Google search results.


2 answers


Yuri Esin, 2017-07-24
@Exomode

Most likely it is the classic "cumulative" big-data approach. Data is crawled from the web asynchronously in the background, which keeps the collection up to date and continuously replenished. Metadata for fast analysis is then built from it and stored in the service's database. When you submit a text for checking, it is compared against that metadata using fuzzy search or other optimized text-matching algorithms, and the result is returned. Of course, I may be wrong, but if I had to implement such a solution, it would work along the lines described above.
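
To make the idea concrete, here is a minimal sketch of such a pipeline. It is only a guess at the general principle, not how text.ru actually works: the names ShingleIndex, add and check are made up for illustration, the "metadata" is assumed to be hashes of 3-word shingles, and a plain in-memory dict stands in for the service database.

```python
import hashlib
import re
from collections import defaultdict


def shingles(text, k=3):
    """Return the set of hashed k-word shingles of a text (the assumed 'metadata')."""
    words = re.findall(r"\w+", text.lower())
    grams = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    return {hashlib.md5(g.encode("utf-8")).hexdigest()[:16] for g in grams}


class ShingleIndex:
    """Toy in-memory stand-in for the service database (hypothetical design)."""

    def __init__(self, k=3):
        self.k = k
        # shingle hash -> ids of the documents that contain it
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        """Background step: index a page that the crawler has fetched."""
        for h in shingles(text, self.k):
            self.postings[h].add(doc_id)

    def check(self, text):
        """Query step: for each indexed document, return the share of the
        submitted text's shingles that also occur in that document."""
        query = shingles(text, self.k)
        if not query:
            return {}
        hits = defaultdict(int)
        for h in query:
            for doc_id in self.postings.get(h, ()):
                hits[doc_id] += 1
        return {doc_id: n / len(query) for doc_id, n in hits.items()}


index = ShingleIndex()
index.add("page-1", "the quick brown fox jumps over the lazy dog")
print(index.check("the quick brown fox sleeps"))
```

The check itself only touches the prebuilt index, so it is fast no matter how slow the crawling is. A real service would also need a persistent store and a more compact fingerprint scheme, which this sketch leaves out.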


Dimonchik, 2017-07-24
@dimonchik2013

Ha ha, everything unknown to us seems wonderful.
There is no secret: search engines.

"the appearance of this content in the search results of Yandex and Google still have to wait more than one week."

And in DuckDuckGo you don't have to wait.
Of course, text.ru keeps its nose to the wind and also grazes the rest of the exchanges ("quickly posted, quickly lost" (c)), but there are no miracles; there is just something unfamiliar to you.
This becomes obvious when you check the text from, for example, different IP addresses.