How do I search by part of a word, by the full word, and with typos in Elasticsearch?
Good afternoon!
I'm building ordinary full-text search over a blog, products, etc. with Elasticsearch.
Example: I'm searching for the word "two-chamber" (Russian stem "двухкамер").
The problem is that with query_string, only partial matches up to the word without its ending are found, that is:
two, two-cham, two-chamber (without the ending) - found
two-chamber (full forms, with the ending) - no longer found
The second option, multi_match, does the opposite: nothing shorter than the full word is found:
two, two-cham, two-chamber (without the ending) - not found
two-chamber (full forms, with the ending) - found
I tried enabling n-grams and various index settings. The settings are below.
How can I make it search by part of a word, by a word with its ending, and by a word with typos?
Request example:
'query_string' => [
    "fields" => ["title", "content"],
    "query" => "*двухкамер*",
    "fuzziness" => "AUTO",
]
Index settings:
'settings' => [
    'number_of_shards' => 1,
    'number_of_replicas' => 0,
    'analysis' => [
        'filter' => [
            "shingle" => [
                "type" => "shingle",
            ],
            "ru_stop" => [
                "type" => "stop",
                "stopwords" => "_russian_"
            ],
            "ru_stemmer" => [
                "type" => "stemmer",
                "language" => "russian"
            ],
            "en_stemmer" => [
                "type" => "stemmer",
                "language" => "english"
            ],
            "my_n_gram" => [
                "type" => "edge_ngram",
                "min_gram" => 3,
                "max_gram" => 30,
            ]
        ],
        'char_filter' => [
            'pre_negs' => [
                'type' => 'pattern_replace',
                'pattern' => '(\\w+)\\s+((?i:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint))\\b',
                'replacement' => '~$1 $2'
            ],
            'post_negs' => [
                'type' => 'pattern_replace',
                'pattern' => '\\b((?i:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint))\\s+(\\w+)',
                'replacement' => '$1 ~$2'
            ]
        ],
        'analyzer' => [
            'default' => [
                'char_filter' => ["html_strip", "pre_negs", "post_negs"],
                'type' => 'custom',
                'tokenizer' => 'standard',
                'filter' => ['lowercase', 'trim', 'ru_stemmer', 'en_stemmer', 'ru_stop', 'my_n_gram']
            ]
        ]
    ]
],
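To make it easier to see which fragments the my_n_gram filter in the settings above would put into the index, here is a rough sketch of what an edge_ngram filter with min_gram 3 and max_gram 30 emits for a single token. The function name edgeNgrams is hypothetical and is not part of Elasticsearch or its PHP client. The sketch also ignores the stemmers and stop filter that run before my_n_gram in the analyzer, so the real tokens may differ. It assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper for illustration only: approximates the output of an
// edge_ngram token filter for one token. Not an Elasticsearch API.
function edgeNgrams(string $token, int $minGram = 3, int $maxGram = 30): array
{
    $length = mb_strlen($token, 'UTF-8');
    $grams = [];
    // Emit every prefix whose length lies between min_gram and max_gram.
    // Tokens shorter than min_gram produce nothing.
    for ($n = $minGram; $n <= min($maxGram, $length); $n++) {
        $grams[] = mb_substr($token, 0, $n, 'UTF-8');
    }
    return $grams;
}

// Prefixes of the stem used in the request example.
print_r(edgeNgrams('двухкамер'));
```

Since the analyzer is registered as default, it is also applied to analyzed query text unless a separate search analyzer is configured, so query terms may be split into the same kind of prefixes.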