Sphinx
DeusModus, 2011-02-18 13:32:04

Sphinx and strange behavior when using exceptions

I read a lot in the documentation about wordforms, stop words and exceptions, and decided to improve my search.
I'm having some problems with wordforms:
Word: year2000
year 2000 > year2000 | works as expected

Word: year 2000
year2000 > year 2000 | does not work; the results seem to match only "year"

and

exceptions:
Word: year 2000
year2000 => year 2000 | does not work

As far as I can tell, Sphinx has trouble with spaces on the right-hand side of a rule. I've tried quoting the right-hand side; it doesn't help.
Example from the docs on exceptions:
AT & T => AT&T
AT&T => AT&T
Standarten Fuehrer => standartenfuhrer
Standarten Fuhrer => standartenfuhrer
MS Windows => ms windows
Microsoft Windows => ms windows
C++ => cplusplus
c++ => cplusplus
C plus plus => cplusplus


As far as I understand, queries for "Microsoft Windows" and "ms windows" should return the same number of results. On my Sphinx 0.9.9-release (r2117), with SPH_MATCH_EXTENDED and SPH_RANK_PROXIMITY_BM25, a query for "ms windows" returns nothing (and neither does "year 2000").


1 answer(s)
Oleg Ekhlakov @XOlegator, 2021-01-12

The wordforms documentation (sphinxsearch.com/docs/manual-2.3.2.html#conf-wordforms) says that multiple tokens on the right-hand side of a wordforms rule are allowed since version 2.2.4. I checked: spaces are allowed there, but other delimiters from the blend_chars parameter are not.
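
As for the exceptions half of the question: the Sphinx manual's section on exceptions notes that the right-hand side of a rule is always treated as a single keyword and is both case- and space-sensitive, so a query for "ms windows" is parsed as two keywords and is not expected to match text that was mapped to the single keyword "ms windows". Below is a rough Python sketch of that behavior. It is only a toy model, not Sphinx code; the `tokenize` helper and its whitespace splitting are assumptions made purely for illustration.

```python
# Toy model only: NOT Sphinx code. It mimics the documented rule that the
# right-hand side of an exception becomes ONE keyword, spaces included.
EXCEPTIONS = {
    "MS Windows": "ms windows",
    "Microsoft Windows": "ms windows",
}


def tokenize(text, exceptions=EXCEPTIONS):
    """Split text into keywords, applying case-sensitive exceptions first."""
    tokens = []
    rest = text.lstrip()
    while rest:
        # Try longer exceptions first; the match is case-sensitive and must
        # end at a space or at the end of the text.
        for src in sorted(exceptions, key=len, reverse=True):
            tail = rest[len(src):]
            if rest.startswith(src) and (tail == "" or tail[0] == " "):
                tokens.append(exceptions[src])  # one keyword, space inside
                rest = tail
                break
        else:
            # No exception applies: take one ordinary word and lowercase it.
            word, _, rest = rest.partition(" ")
            tokens.append(word.lower())
        rest = rest.lstrip()
    return tokens


def matches(query, document):
    """True if every query keyword occurs among the document keywords."""
    doc_tokens = tokenize(document)
    return all(tok in doc_tokens for tok in tokenize(query))


doc = "I use MS Windows"
print(tokenize(doc))                      # document keywords
print(tokenize("ms windows"))             # query keywords
print(matches("ms windows", doc))         # lowercase query
print(matches("Microsoft Windows", doc))  # query that hits an exception
```

In this model the lowercase query splits into "ms" and "windows", neither of which equals the indexed keyword "ms windows", while "Microsoft Windows" is mapped by an exception to that same single keyword and therefore matches.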
