Sphinx
zednight, 2013-05-22 12:59:17

How do I make Sphinx treat "sleeve" and "mitten" as different words?

I'm setting up Sphinx search for an online store. The query "mitt" finds both mittens and every product that mentions a sleeve. Apparently the root "sleeve" is extracted during indexing, because even when sorting by relevance the mittens are not all returned first. How can I make it search for sleeves and mittens separately?
Config:
index productsindex
{
    source = products
    path = /var/lib/sphinxsearch/data/products
    docinfo = extern
    mlock = 0
    morphology = stem_en, stem_ru
    wordforms = /var/lib/sphinxsearch/data/wordforms.txt
    min_stemming_len = 3
    min_word_len = 3
    charset_type = utf-8
    charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
    expand_keywords = 1
    min_prefix_len = 3
    #min_infix_len = 3
    index_exact_words = 1
    prefix_fields = name
    #infix_fields = name
    enable_star = 1
    html_strip = 1
}

3 answers
zednight, 2013-05-23

I asked on the Sphinx forum, and they recommended using a wordforms dictionary in this format:
Mitten > Mitten
Mittens > Mitten
In the end I had to add the declined forms for every case, singular and plural; in practice that came to 7 lines.
It seems to help.
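
For anyone who would rather not type such a file by hand, here is a minimal Python sketch that prints wordforms lines in the "form > normal form" shape shown above. The function name `wordform_lines` is hypothetical, the English words stand in for the real inflected forms from the catalogue, and lowercasing the entries is my own assumption (the charset_table in the question folds case anyway).

```python
def wordform_lines(canonical, forms):
    """Return wordforms lines mapping every form to one canonical form.

    Hypothetical helper: the canonical form is mapped to itself first,
    then each extra form is mapped to it, skipping duplicates.
    """
    target = canonical.strip().lower()
    seen = []
    for form in [canonical, *forms]:
        token = form.strip().lower()
        if token and token not in seen:
            seen.append(token)
    return [f"{token} > {target}" for token in seen]


if __name__ == "__main__":
    # Placeholder words; substitute the real case forms of the product name.
    for line in wordform_lines("mitten", ["mittens", "Mitten"]):
        print(line)
```

The printed lines can be appended to the file named in the wordforms setting. Since wordforms are applied at indexing time, the index has to be rebuilt before the change shows up in search results.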

becks, 2013-05-22

Maybe you could build the index by lemma rather than by stem (the words "sleeve" and "mitten" share the same stem) by connecting a morphological dictionary? The new version already supports this.

XaosSintez, 2013-05-22

In general, the index_exact_words parameter exists for exactly this, but I see you have already enabled it.
So you need to work on the ranking, so that more exact matches are returned higher.
Try experimenting with ranking_mode.
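
As a rough illustration of that suggestion, the Python sketch below only builds a SphinxQL query string; it does not connect to searchd. It assumes the index name productsindex from the question, uses the exact-form operator `=` (which relies on index_exact_words = 1), and picks the sph04 ranker purely as an example of switching the ranking mode. The function name `build_exact_query` and the simplistic quote escaping are my own assumptions.

```python
def build_exact_query(index, word, ranker="sph04", limit=20):
    """Build a SphinxQL SELECT that matches only the exact form of `word`.

    Hypothetical helper: it returns the query text and leaves sending it
    to the server to whatever MySQL-protocol client is in use.
    """
    # Naive escaping of backslashes and single quotes inside MATCH('...').
    safe = word.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"SELECT id FROM {index} "
        f"WHERE MATCH('={safe}') "
        f"LIMIT {limit} OPTION ranker={ranker}"
    )


if __name__ == "__main__":
    print(build_exact_query("productsindex", "mitten"))
```

Whether sph04 or another ranker gives the best ordering for this catalogue is something to check on real data.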
