Why are there strange characters at the beginning of words in the index?
I'm trying to get Sphinx to match words both with and without a hyphen, but it doesn't work: different settings give different results, and all of them are wrong :)
When I run the command indextool --dumpdict services | grep верх
I see the following result:
As you can see, the word "верх" appears twice: first with some kind of symbol at the beginning, and then in its normal form.
When I run indextool --dumpdict services | grep исетская
I get:
I'm trying to search for the phrase "Верх-Исетская". In the database this service is called "UK Verkh-Isetskaya (CJSC) (commercial services)", and my task is to make it searchable equally well with and without the hyphen. There are also several other services in the database whose names contain similar forms such as "Verkh-Isetsky", and for some reason they are ranked higher than the one I am looking for.
Could these stray characters at the beginning of keywords be the cause of my problems, and how should I deal with this behavior?
Here is my config, just in case:
index services
{
source = services
path = /var/lib/sphinxsearch/data/services
docinfo = extern
morphology = stem_enru
min_stemming_len = 1
min_word_len = 1
min_infix_len = 1
html_strip = 1
index_exact_words = 1
expand_keywords = 1
mlock = 0
charset_table = 0..9, A..Z->a..z, a..z, U+2C->U+2E, U+2E, U+0044, U+0046, U+0130, U+0401->U+0435, U+0451->U+0435, U+410..U+42F->U+430..U+44F, U+430..U+44F
}
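To see what the strange leading symbol actually is, it may help to print its code point instead of relying on how the terminal renders it. The sketch below is only a diagnostic aid and makes two assumptions that the thread does not confirm: that the keyword is the first comma-separated field of each indextool --dumpdict line, and that the symbol is a non-printable marker character. One plausible explanation, given index_exact_words = 1 in the config above, is that Sphinx stores the unstemmed form of each keyword behind such a marker next to the stemmed form, but treat that as a guess. The names describe_keyword and split_dump are hypothetical helpers, not part of Sphinx.

```python
import unicodedata


def describe_keyword(keyword: str) -> str:
    """Return the keyword with every non-printable character shown as <U+XXXX>."""
    parts = []
    for ch in keyword:
        # Unicode categories starting with "C" cover control and format characters.
        if unicodedata.category(ch).startswith("C"):
            parts.append(f"<U+{ord(ch):04X}>")
        else:
            parts.append(ch)
    return "".join(parts)


def split_dump(lines):
    """Split dictionary dump lines into (marked, plain) keyword lists.

    Assumption: the keyword is the first comma-separated field of each line.
    """
    marked, plain = [], []
    for line in lines:
        keyword = line.rstrip("\n").split(",", 1)[0]
        if not keyword:
            continue  # skip empty lines
        if unicodedata.category(keyword[0]).startswith("C"):
            marked.append(describe_keyword(keyword))
        else:
            plain.append(keyword)
    return marked, plain
```

To use it, save the dump with indextool --dumpdict services > dump.txt, then call split_dump(open("dump.txt", encoding="utf-8")). If every marked entry has a plain twin that differs only by the leading code point, the duplicates are probably two stored forms of the same word rather than index corruption.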
If you want the original word form to take precedence over everything else in Sphinx, there is an option for that; you can find it in my article about Sphinx on Habr.