Search and sort by % string similarity?
Environment: Doctrine ORM + PostgreSQL.
Example of user search by name
Test data
Rodger
Sauron
Ron
Rin
Ro
LIKE %r%
ORDER BY user.name
Rin
Rodger
Sauron
Ro
Ron
WHERE user.name LIKE %searchInput% ORDER BY user.name DESC
Ro // first because: 1) it starts with R, and 2) R is 50% similar to searchInput, which = 'R'
Rin // second because I comes before O in the alphabet; similarity to R is 33%
Ron
// Rodger is dropped because R is 1/6 of Rodger, about 16.7%, which is below 30%
// Sauron is dropped from the results because the R is inside the word, not at the start of the name
WHERE user.name LIKE 'searchInput%'
AND userNameSimilarity > 30%
ORDER BY userNameSimilarity DESC, user.name
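To pin down the expected result, here is a minimal Python sketch of the ranking described above. It assumes that "similarity" means the share of the name covered by the search input (length of input divided by length of name), which is what the 50% / 33% / 16.7% figures suggest. The function name `rank_by_prefix_share` and its parameters are made up for this illustration and are not part of Doctrine or PostgreSQL.

```python
def rank_by_prefix_share(names, query, threshold=0.30):
    """Keep names that start with `query` (case-insensitive) and whose
    similarity len(query) / len(name) is above `threshold`.
    Sort by similarity descending, then by name ascending."""
    q = query.lower()
    # Assumption: similarity = share of the name covered by the query.
    scored = [
        (len(q) / len(name), name)
        for name in names
        if name and name.lower().startswith(q)
    ]
    kept = [(score, name) for score, name in scored if score > threshold]
    kept.sort(key=lambda pair: (-pair[0], pair[1].lower()))
    # Return (name, similarity in percent) pairs.
    return [(name, round(score * 100, 1)) for score, name in kept]


users = ["Rodger", "Sauron", "Ron", "Rin", "Ro"]
print(rank_by_prefix_share(users, "R"))
# [('Ro', 50.0), ('Rin', 33.3), ('Ron', 33.3)]
```

Rodger (16.7%) falls below the threshold and Sauron fails the prefix check, so both are dropped, as in the expected output above.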
Adjacency in the alphabet is a weak criterion. Most typos come from keys being close together on the keyboard (do not forget the different layouts), from spelling a word "by ear", or from plain spelling mistakes. Fuzzy search algorithms were developed for this long ago, including ones that account for pronunciation (Soundex) and ones for large texts (anti-plagiarism systems, shingles, etc.).
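PostgreSQL provides functions for calculating the Levenshtein distance and Soundex (and a few others, such as Metaphone) through the bundled fuzzystrmatch extension, which has to be enabled with CREATE EXTENSION. There is also a plugin for calculating the Hamming distance.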
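As a rough illustration of what a Levenshtein-based similarity percentage looks like, here is a small Python sketch. Inside the database the distance itself would come from the `levenshtein` function of fuzzystrmatch. The normalisation by the longer string's length and the helper name `similarity_percent` are assumptions made for this example, not something PostgreSQL defines.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: the minimum number of single-character insertions,
    deletions and substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep on a match
            ))
        prev = cur
    return prev[-1]


def similarity_percent(a: str, b: str) -> float:
    """Assumed normalisation: 100 * (1 - distance / length of longer string)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return round(100 * (1 - levenshtein(a.lower(), b.lower()) / longest), 1)


for name in ["Ro", "Rin", "Ron", "Rodger", "Sauron"]:
    print(name, similarity_percent("R", name))
```

With this normalisation the input 'R' scores 50.0 against Ro, 33.3 against Rin and Ron, and 16.7 against Rodger, which matches the percentages in the question. Sauron also scores 16.7, so a prefix condition is still needed if matches inside the word must be excluded.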
Sphinx offers many good fuzzy-search features, and it can index data directly from PostgreSQL. On large tables it will, as a rule, greatly outperform a hand-written solution.
Unfortunately, a significant share of the ready-made solutions are tailored to English, and other languages cause more difficulties.