Sphinx
Shing, 2011-04-20 09:33:07

Algorithm for finding logically similar full names in a database (ardzhenikidze = ordzhonikidze = ...)

There are two lists of full names; they must be compared and the matches found.
Not exact matches, though, but logical ones.
Both lists may contain typos.

Therefore,
ardzhenikidze = ordzhonikidze = ardzhonikidze = ardzhenikidze = ardzhenikidze
and so on. That is, there can be many spelling variants, but to a human reader it is clear that this is, with 99% certainty, the same surname.

The search covers the surname, first name, and patronymic.
Any of them may be misspelled, but the overall logical "code" must still match.

Ordzhenikidze Alexey Vecheslavovich = Ardzhenikidze Alexy Vyacheslavovich

Any ideas on the algorithm?


3 answer(s)
ertaquo, 2011-04-20
@ertaquo

There were several articles on this topic on Habr:
habrahabr.ru/blogs/algorithm/117063/
habrahabr.ru/blogs/algorithm/115147/
habrahabr.ru/blogs/algorithm/114947/
PHP has built-in functions for the Soundex and Metaphone algorithms, but I don't know whether they work with Russian.
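
For illustration only, here is a minimal Python sketch of classic American Soundex, the kind of phonetic key mentioned above. It is not PHP's implementation, and it assumes the names have already been transliterated to Latin letters (Cyrillic input would need its own letter table, which is not shown here). The function name `soundex` is just a label for this sketch.

```python
def soundex(name: str) -> str:
    """Classic American Soundex: first letter plus three digits (Latin letters only)."""
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit

    # Keep only A-Z; anything else (spaces, hyphens, Cyrillic) is dropped.
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""

    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        # H and W do not separate letters with the same code; vowels do.
        if ch not in "HW":
            prev = code
    return (result + "000")[:4]


print(soundex("ordzhonikidze"), soundex("ardzhenikidze"))
```

Note that Soundex keeps the first letter as-is, so the two surnames get keys that agree in the digits but differ in the leading letter. For data like this, comparing only the digit part, or normalizing the first vowel beforehand, may be worth trying.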

Vsevolod, 2011-04-20
@sevka_fedoroff

I once used this algorithm in Delphi: www.delphikingdom.ru/asp/viewitem.asp?catalogid=722
It is good because it lets you adjust the sensitivity. And it really worked.
I think it can be easily ported from Pascal to any other language.
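
The linked Delphi algorithm is not reproduced here. As a stand-in, the idea of an adjustable sensitivity can be sketched in Python with the standard library's `difflib.SequenceMatcher`, a different technique. The helper names `similarity` and `same_person` and the default `threshold` of 0.8 are assumptions made for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two strings, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def same_person(full_a: str, full_b: str, threshold: float = 0.8) -> bool:
    """Compare surname, name and patronymic pairwise.

    Every part must reach the threshold; raising it makes the match stricter.
    """
    parts_a, parts_b = full_a.split(), full_b.split()
    if len(parts_a) != len(parts_b):
        return False
    return all(similarity(x, y) >= threshold for x, y in zip(parts_a, parts_b))


a = "Ordzhenikidze Alexey Vecheslavovich"
b = "Ardzhenikidze Alexy Vyacheslavovich"
print(same_person(a, b))                  # lenient threshold
print(same_person(a, b, threshold=0.95))  # strict threshold
```

With the pair from the question, the default threshold accepts the match and the strict one rejects it, which is the kind of tuning knob described above. A suitable threshold would have to be chosen on real data.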

philpirj, 2011-04-21
@philpirj

PostgreSQL + trigrams. Caution: the material is in a foreign language.
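
To show what trigram matching does, here is a rough Python sketch of the idea rather than PostgreSQL itself. It pads each word with two leading spaces and one trailing space, as the pg_trgm documentation describes, and scores two strings by shared trigrams divided by all distinct trigrams. The function names are made up for this example, and splitting on whitespace only is a simplification.

```python
def trigrams(text: str) -> set:
    """Set of 3-character substrings of each word, padded in the style of pg_trgm."""
    result = set()
    for word in text.lower().split():
        padded = "  " + word + " "
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(trigram_similarity("ordzhonikidze", "ardzhenikidze"))
```

The two spellings of the surname share 8 of 20 distinct trigrams, a similarity of 0.4. In the database, a trigram index can serve such comparisons directly, so the two lists do not have to be compared pair by pair in application code.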
