SQL
Roman Mirilaczvili, 2016-12-29 17:30:26

Which search engine recognizes "red caviar" and "caviar red" as duplicates?

Suppose a MySQL/PostgreSQL database contains duplicates of the same expressions/terms in the product_name column. How can such duplicates be identified?
Levenshtein distance doesn't help here. What other options are there?


3 answers
Roman Mirilaczvili, 2016-12-30
@2ord

I sketched a simple Ruby implementation for finding matching strings, i.e. duplicate names:
Simple function for fuzzy string match
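
The linked function is not reproduced in this thread. As a separate illustration (not the author's code), here is a minimal sketch that assumes the duplicates differ only in case, punctuation, or word order, for example "red caviar" vs "Caviar, red". The helper names normalized_key and duplicate_groups are hypothetical.

```ruby
# Hypothetical helper: builds a comparison key that ignores case,
# punctuation and word order by lowercasing, extracting words and sorting them.
def normalized_key(name)
  name.downcase.scan(/[[:alnum:]]+/).sort.join(" ")
end

# Groups names by their normalized key and keeps only the groups
# that contain more than one name, i.e. the suspected duplicates.
def duplicate_groups(names)
  names.group_by { |n| normalized_key(n) }
       .values
       .select { |group| group.size > 1 }
end

names = ["red caviar", "Caviar, red", "black caviar"]
p duplicate_groups(names)
# => [["red caviar", "Caviar, red"]]
```

This only catches exact matches after normalization; names with typos or extra words would need a similarity measure instead of an equality check.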

al_gon, 2016-12-29
@al_gon

n-grams and coefficients.
https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%...
https://toster.ru/answer?answer_id=908115#comments...
https://en.wikibooks.org/wiki/Algorithm_Implementa...
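
To make the n-gram idea concrete, here is a minimal sketch of the Sørensen-Dice coefficient over character bigrams. It is not taken from the linked pages. Two things are assumptions of this sketch: bigrams are built per word, so that swapping word order leaves the bigram set unchanged, and the function names bigrams and dice are hypothetical.

```ruby
# Character bigrams of a string, built word by word so that no bigram
# spans a space (an assumption of this sketch, to tolerate word reordering).
def bigrams(text)
  text.downcase.scan(/[[:alnum:]]+/).flat_map do |word|
    word.chars.each_cons(2).map(&:join)
  end
end

# Sørensen-Dice coefficient on bigram multisets:
# 2 * |A ∩ B| / (|A| + |B|), ranging from 0.0 to 1.0.
def dice(a, b)
  x = bigrams(a)
  y = bigrams(b)
  return 0.0 if x.empty? || y.empty?

  # Count the bigrams of b, then consume them while scanning a,
  # so repeated bigrams are matched at most as often as they occur.
  counts = Hash.new(0)
  y.each { |g| counts[g] += 1 }
  overlap = 0
  x.each do |g|
    next unless counts[g] > 0
    overlap += 1
    counts[g] -= 1
  end

  2.0 * overlap / (x.size + y.size)
end

puts dice("red caviar", "caviar red")    # => 1.0
puts dice("red caviar", "black caviar")  # => 0.625
```

Pairs scoring above some threshold would be treated as duplicate candidates; the threshold is not given in the thread and has to be tuned on real product names.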
For those wondering what this looks like when it is offered as a separate service: www.findologic.com/ru/features
They are from Austria, and the page has a lot of marketing "blah blah", but there are some interesting points as well.

xmoonlight, 2016-12-29
@xmoonlight

Such
