How to set up stemming in Sphinx?
I am using Sphinx 3.0.3. I created a simple test file: Белая корова. Белыми коровами. Белой коровой.
The question is: why does Sphinx find these phrases for one form of the word "cow" (a form that is not in the text), but not for other forms of the same word? It looks as if Sphinx is not using stemming (truncating every form to a common stem) but a dictionary that lacks those forms?
The config so far is:
morphology = stem_enru
min_word_len = 2
index_exact_words = 1
expand_keywords = 1
min_infix_len = 3
min_prefix_len = 3
#enable_star = 1 #removed
#min_word_len = 1 #removed
#dict = keywords #removed
Stemming only cuts off the endings; you need to go the other way. Look for the article about Sphinx on Habr in my profile: it describes an option for exactly your case.
Here is my finished algorithm (in PHP) for fuzzy search over strings of words with arbitrary word beginnings and endings, including automatic correction of look-alike characters to the alphabet of the desired language.
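To illustrate just the look-alike correction step, here is a rough sketch in Python. It is not the PHP algorithm mentioned above: the mapping table and the function names (`fix_lookalikes`, `normalize_query`) are made up for this example, and the rule "only fix words that already contain a Cyrillic letter" is an assumption.

```python
# Latin letters that look like Cyrillic ones, mapped to their Cyrillic twins.
# Escapes are used on purpose: the glyphs are indistinguishable on screen.
# The table is illustrative and far from complete.
LATIN_TO_CYRILLIC = str.maketrans({
    "a": "\u0430", "c": "\u0441", "e": "\u0435", "o": "\u043e",
    "p": "\u0440", "x": "\u0445", "y": "\u0443",
    "A": "\u0410", "B": "\u0412", "C": "\u0421", "E": "\u0415",
    "H": "\u041d", "K": "\u041a", "M": "\u041c", "O": "\u041e",
    "P": "\u0420", "T": "\u0422", "X": "\u0425",
})


def is_cyrillic(ch):
    """True if the character is in the basic Cyrillic Unicode block."""
    return "\u0400" <= ch <= "\u04ff"


def fix_lookalikes(word):
    """Replace Latin look-alikes, but only in words that contain Cyrillic.

    Purely Latin words are left alone so that English terms in the
    query are not mangled (an assumption of this sketch).
    """
    if any(is_cyrillic(ch) for ch in word):
        return word.translate(LATIN_TO_CYRILLIC)
    return word


def normalize_query(query):
    """Apply the look-alike fix to every whitespace-separated word."""
    return " ".join(fix_lookalikes(w) for w in query.split())
```

Running the query through `normalize_query` before sending it to Sphinx means a word typed with a Latin "o" in place of a Cyrillic one still reaches the index in its Cyrillic spelling.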
PS: By the way, here is something similar to Sphinx: stumper.ru (probably built recently).
Option 2: strip all of the following suffixes from the search query with a regex, and the problem is solved:
-щик, -льщик
-анин, -янин
-ница, -тель
-льник
-ница
-ость, -есть
-ота, -ета
-ецо, -ице
-изна
-ство
-отня, -овня
-ство, -ество
-ина, -инка
-ёнок, -онок
-очка, -ечка, -ичка
-енька, -онька
-ушка, -юшка
-ышко
-ишко, -ишка
-ёнка, -онка
-инка, -енка
-ище, -ища
-ушк-, -юшк-, -ышк-
-ёнка, -онка, -ёнок, -онок, -юнок, -унок
-енька, -онька, -анька
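Option 2 could look roughly like this in Python. This is only a sketch: it uses a handful of suffixes from the list above, `MIN_STEM_LEN` and the function names are invented for the example, and appending `*` to the stripped word assumes prefix matching is enabled in the index (which `min_prefix_len = 3` in the config suggests).

```python
import re

# A few suffixes taken from the list above; extend as needed.
SUFFIXES = [
    "льщик", "щик", "анин", "янин", "тель", "ость", "есть",
    "ство", "очка", "ечка", "ичка", "ушка", "юшка", "ёнок", "онок",
]

# Assumption: keep at least this many letters so short words are not emptied.
MIN_STEM_LEN = 3

# Longest alternatives first, so "льщик" is preferred over "щик".
_alternatives = "|".join(
    re.escape(s) for s in sorted(SUFFIXES, key=len, reverse=True)
)

# The lookbehind requires MIN_STEM_LEN word characters before the suffix.
SUFFIX_RE = re.compile(
    r"(?<=\w{" + str(MIN_STEM_LEN) + r"})(?:" + _alternatives + r")$"
)


def strip_suffix(word):
    """Remove one known suffix from the end of the word, if present."""
    return SUFFIX_RE.sub("", word)


def to_sphinx_query(query):
    """Lowercase each word, strip its suffix, and mark stripped words with '*'."""
    out = []
    for word in query.lower().split():
        stem = strip_suffix(word)
        out.append(stem + "*" if stem != word else word)
    return " ".join(out)


print(to_sphinx_query("Коровушка белая"))
```

Note that these are word-formation suffixes, not case endings, so forms like "коровами" pass through unchanged; they would need their own list of endings or the stemmer itself.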