elasticsearch
MrFavour, 2022-01-29 01:40:14

How do I search in Elasticsearch by part of a word, by the full word, and with typos?

Good afternoon!
I'm building a regular full-text search over a blog, products, etc. using Elasticsearch.

Example: I'm searching for the word "two-chamber" (the request below uses the stem двухкамер).
The problem is that with query_string, only queries up to and including the word without its ending match:
two, two-cam, two-chamber (without the ending) - finds
two-chamber (the full word, with the ending) - no longer finds

The second option, multi_match, behaves the opposite way: nothing shorter than the full word matches:
two, two-cam, two-chamber (without the ending) - does not find
two-chamber (the full word, with the ending) - finds

I tried enabling n-grams and several variations of the index settings. The current settings are below.
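
One way to see why a given query does or does not match is to ask Elasticsearch which tokens the analyzer actually produces for the full word. The sketch below builds the parameters for the Analyze API. The index name blog is hypothetical, the sample text is an assumed full form of the word (the question only shows the stem), and $client is assumed to be an already configured elasticsearch-php client.

```php
<?php
// Parameters for the Analyze API.
// 'blog' is a hypothetical index name; replace it with the real one.
$params = [
    'index' => 'blog',
    'body'  => [
        'analyzer' => 'default',      // the analyzer defined in the index settings
        'text'     => 'двухкамерный', // assumed full form of the word
    ],
];

// Assumes $client is an already configured elasticsearch-php client:
// $response = $client->indices()->analyze($params);
// Each entry in $response['tokens'] carries the produced term under 'token'.
```

Comparing that token list with the query text usually shows whether the stemmer or the n-gram filter is the step that drops the ending.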

How can I make it match by part of a word, by the word with its ending, and by a word containing typos?

Request example:

'query_string' => [
    "fields" => ["title", "content"],
    "query" => "*двухкамер*",
    "fuzziness" => "AUTO",
]
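
For comparison, here is a sketch of the same search without wildcards. As far as I know, the query_string documentation advises against mixing wildcards with fuzziness, because one of the two is not applied, and wildcard terms are not analyzed by default. A multi_match query analyzes the query text and applies fuzziness to it. The surrounding 'query' wrapper and the 'operator' value are assumptions of this sketch, not part of the original request.

```php
<?php
// Sketch of a search body without wildcards, so the query text goes through
// the analyzer and fuzziness can be applied to it.
$body = [
    'query' => [
        'multi_match' => [
            'query'     => 'двухкамер',
            'fields'    => ['title', 'content'],
            'fuzziness' => 'AUTO',  // allowed edit distance depends on term length
            'operator'  => 'and',   // assumption: require every query term to match
        ],
    ],
];
```

Whether a partial word matches here depends entirely on the tokens stored in the index, so this only helps together with an index analyzer that emits prefixes of each word.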

Index settings:
'settings' => [
                    'number_of_shards' => 1,
                    'number_of_replicas' => 0,
                    'analysis' => [
                        'filter' => [
                            "shingle" => [
                                "type" => "shingle",
                            ],
                            "ru_stop" => [
                                "type" => "stop",
                                "stopwords" => "_russian_"
                            ],
                            "ru_stemmer" => [
                                "type" => "stemmer",
                                "language" => "russian"
                            ],
                            "en_stemmer" => [
                                "type" => "stemmer",
                                "language" => "english"
                            ],
                            "my_n_gram" => [
                                "type" => "edge_ngram",
                                "min_gram" => 3,
                                "max_gram" => 30,
                            ]
                        ],
                        'char_filter' => [
                            'pre_negs' => [
                                'type' => 'pattern_replace',
                                'pattern' => '(\\w+)\\s+((?i:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint))\\b',
                                'replacement' => '~$1 $2'
                            ],
                            'post_negs' => [
                                'type' => 'pattern_replace',
                                'pattern' => '\\b((?i:never|no|nothing|nowhere|noone|none|not|havent|hasnt|hadnt|cant|couldnt|shouldnt|wont|wouldnt|dont|doesnt|didnt|isnt|arent|aint))\\s+(\\w+)',
                                'replacement' => '$1 ~$2'
                            ]
                        ],
                        'analyzer' => [
                            'default' => [
                                'char_filter' => ["html_strip", "pre_negs", "post_negs"],
                                'type' => 'custom',
                                'tokenizer' => 'standard',
                                'filter' => ['lowercase', 'trim', 'ru_stemmer', 'en_stemmer', 'ru_stop', 'my_n_gram']
                            ]
                        ]
                    ]
                ],
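
A commonly suggested arrangement, sketched below under stated assumptions, is to produce edge n-grams only at index time and to analyze the query without them. Elasticsearch recognizes the analyzer names default (indexing) and default_search (searching) in the index settings. The stemmers, stop words, and char filters from the settings above are left out deliberately to keep the sketch short; whether they belong in either analyzer is an open choice, not something this sketch settles.

```php
<?php
// Sketch only: edge n-grams are generated at index time, not at search time.
$analysis = [
    'filter' => [
        'my_n_gram' => [
            'type'     => 'edge_ngram',
            'min_gram' => 3,
            'max_gram' => 30,
        ],
    ],
    'analyzer' => [
        // Applied when documents are indexed: every word is stored
        // as its prefixes of 3 to 30 characters.
        'default' => [
            'type'      => 'custom',
            'tokenizer' => 'standard',
            'filter'    => ['lowercase', 'my_n_gram'],
        ],
        // Applied to query text: the typed word is kept whole,
        // so it is compared against the stored prefixes as-is.
        'default_search' => [
            'type'      => 'custom',
            'tokenizer' => 'standard',
            'filter'    => ['lowercase'],
        ],
    ],
];
```

With this split, a partial word and the full word both correspond to a stored prefix, and fuzziness in a match or multi_match query covers typos. Queries shorter than min_gram will not match, fuzzy matching against prefixes can return looser results than expected, and the index has to be recreated and reindexed after the analyzers change.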
