How to implement a search by last name in the database?
Let's say there is a table with two fields: id and data. The data field stores arbitrary text in Russian, which can contain first names and surnames. I need to implement a search on this table.
Search requirements:
pg_trgm does not work as it should:
SELECT to_tsvector('russian', 'Анна Иванова') @@ to_tsquery('russian', 'иванов'); -- false
SELECT to_tsvector('russian', 'Иван Иванов') @@ to_tsquery('russian', 'иванова'); -- false
SELECT similarity('иванов', 'иванова'); -- 0.66
SELECT similarity('иванов', 'ивановым'); -- 0.6
SELECT similarity('иваныч', 'иванович'); -- 0.33
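To see where the similarity numbers above come from, here is a rough Python sketch of trigram similarity. It assumes the pg_trgm convention of padding each word with two spaces in front and one behind, and it handles single words only. The function names are made up for illustration; this is not the extension's actual code.

```python
def trigrams(word: str) -> set[str]:
    # Assumption: mimic pg_trgm padding (two spaces before the word, one after).
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a: str, b: str) -> float:
    # Shared trigrams divided by all distinct trigrams found in either word.
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


pairs = [("иванов", "иванова"), ("иванов", "ивановым"), ("иваныч", "иванович")]
for left, right in pairs:
    print(left, right, round(similarity(left, right), 2))
```

The three pairs come out at about 0.67, 0.6 and 0.33, in line with the values in the question. The score depends only on how many three-letter fragments overlap, so a short surname plus a long ending drops quickly, and nothing in it knows about Russian declension.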
// Create the index: n-gram analysis at index time, standard tokenization at search time.
client.CreateIndex(
    index,
    m => m.Mappings(mp =>
            mp.Map<Page>(mx =>
                mx.Properties(p =>
                    p.Text(x =>
                        x.Name(f => f.Title)
                            .Analyzer("index_ru")
                            .SearchAnalyzer("search_ru")
                    )
                )
            )
        )
        .Settings(s =>
            s.Analysis(a =>
                // Treat "ё" and "е" as the same letter.
                a.CharFilters(c =>
                    c.Mapping("filter_ru_e", z => z.Mappings("Ё => Е", "ё => е"))
                )
                // Split indexed text into character n-grams of 4 to 20 characters.
                .Tokenizers(t =>
                    t.NGram("n_gram", ng =>
                        ng.MinGram(4).MaxGram(20)
                    )
                )
                // "russian_morphology" and "english_morphology" are not built-in filters;
                // they presumably come from a separately installed morphology plugin.
                .Analyzers(an =>
                    an.Custom("index_ru", ac =>
                        ac.CharFilters("html_strip", "filter_ru_e")
                            .Tokenizer("n_gram")
                            .Filters("stop", "lowercase", "russian_morphology", "english_morphology")
                    )
                    .Custom("search_ru", ac =>
                        ac.CharFilters("html_strip", "filter_ru_e")
                            .Tokenizer("standard")
                            .Filters("stop", "lowercase", "russian_morphology", "english_morphology")
                    )
                )
            )
        )
);

// Index a few sample documents with masculine and feminine forms of the surname.
var docs = new []
{
    new Page("Иван Иванов"),
    new Page("Петр Иванов"),
    new Page("Илья Иванов"),
    new Page("Светлана Иванова"),
    new Page("Анна Иванова"),
};
foreach (var doc in docs)
    client.Index(doc);

// Search by the base form of the surname.
var query = client.Search<Page>(
    s => s.Query(
        q => q.Match(
            f => f.Field(x => x.Title)
                .Query("иванов")
        )
    )
);
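Why the n-gram tokenizer helps here can be shown with a small Python sketch. It assumes the tokenizer simply emits every substring between MinGram and MaxGram characters long, and it ignores the char filters and token filters from the configuration above, so treat it as an illustration and not as a model of what Elasticsearch really stores. The ngrams function is hypothetical.

```python
def ngrams(text: str, min_gram: int = 4, max_gram: int = 20) -> set[str]:
    # Every substring whose length lies between min_gram and max_gram.
    text = text.lower()  # simplification: lowercase first instead of using a token filter
    return {
        text[start:start + size]
        for size in range(min_gram, max_gram + 1)
        for start in range(len(text) - size + 1)
    }


indexed = ngrams("Светлана Иванова")
print("иванов" in indexed)   # True: the base form is a substring of "иванова"
print("иванова" in indexed)  # True
print("петров" in indexed)   # False
```

Because "иванов" is itself one of the n-grams of "Иванова", a query for the base form can match the feminine form without any morphology. The price is a much larger index, since every document produces many overlapping tokens.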
What do you think: why hasn't Cortana been released in Russian yet?
P.S. This is not about the uniqueness of the Russian language; the topic simply has not become trivial yet at this stage of IT development.
Take a look at the services and tools at https://dadata.ru/, something there may come in handy.
It's strange: I use the russian analyzer for Russian fields, and out of the box it handles word forms quite well, although I haven't tested it on surnames. Also try a fuzzy query.
I just checked:
Searching for "swan sensor" finds "Lamp Swans night light with light sensor Cosmos".
Searching for "sites" returns "A gray platform and a round black handle with a lock".
Searching for "lamp" finds the word in several inflected forms, singular and plural.
Searching for "nail" returns "a plastic staple with a nail".
P.S. This is Elasticsearch 5.1.1, in case it matters. I have deliberately not installed any Russian morphology plugins since version 2.x.
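As for the fuzzy query suggestion, here is a minimal Python sketch of what fuzziness means for the surnames from the question. It computes plain Levenshtein distance (Elasticsearch by default also counts a transposition as one edit, which is ignored here) and builds a match query body with "fuzziness": "AUTO". The field name "title" is an assumption based on the Page.Title property used earlier in the thread.

```python
def edit_distance(a: str, b: str) -> int:
    # Classic dynamic-programming Levenshtein distance, one row at a time.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Request body for a fuzzy match query; "title" is an assumed field name.
fuzzy_query = {
    "query": {
        "match": {
            "title": {"query": "иванов", "fuzziness": "AUTO"}
        }
    }
}

for left, right in [("иванов", "иванова"), ("иванов", "ивановым"), ("иваныч", "иванович")]:
    print(left, right, edit_distance(left, right))
```

With AUTO, terms longer than five characters may differ by at most two edits. So "иванов" can reach "иванова" (1 edit) and "ивановым" (2 edits), but "иваныч" and "иванович" are 3 edits apart and would not match. Fuzziness applies to the terms left after analysis, so the result on a real index also depends on the analyzer in use.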