PHP
tutnet, 2014-01-30 09:16:03

Ranking by several factors including zones in Sphinx. How to do it right?

I have an index built from a title field and an HTML text field. Its config contains the following:

min_infix_len = 3
html_strip = 1
index_zones = h*, b, title, szone*
index_exact_words = 1

The task is to rank the search results by several factors:
1a. Full match of the query in the title: the largest weight
1b. Partial match in the title covering at least 60% of the query (i.e. one matching word out of two is not enough)
2a. Full match in the text, with internal ranking by zones. Zones are defined by many tags, including custom ones (i.e. index_zones = h*, b, title, szone*)
2b. The same as 1b, but for the text, taking zones into account
3. Other partial matches in the title
4. Other partial matches in the text
Items 3 and 4 can be neglected (that is, they do not have to be included in the output)
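
To make the 60% rule from items 1a, 1b and 3 concrete, here is a minimal sketch in plain PHP that works on strings only and does not touch Sphinx. The function names queryCoverage and titleTier are hypothetical, and "full match" is assumed to mean that every query word occurs in the title, not that the exact phrase does.

```php
<?php
// Hypothetical helper: fraction of distinct query words that also occur in $text.
function queryCoverage(string $query, string $text): float
{
    $tokenize = function (string $s): array {
        // Lowercase (multibyte-aware when mbstring is available),
        // then split on anything that is not a letter or a digit.
        $s = function_exists('mb_strtolower') ? mb_strtolower($s, 'UTF-8') : strtolower($s);
        $words = preg_split('/[^\p{L}\p{N}]+/u', $s, -1, PREG_SPLIT_NO_EMPTY);
        return array_unique($words);
    };

    $queryWords = $tokenize($query);
    if (count($queryWords) === 0) {
        return 0.0;
    }
    $matched = array_intersect($queryWords, $tokenize($text));
    return count($matched) / count($queryWords);
}

// Hypothetical mapping of a title to items 1a, 1b and 3 of the list above.
// Assumption: "full match" means all query words are present, in any order.
function titleTier(string $query, string $title, float $threshold = 0.6): string
{
    $coverage = queryCoverage($query, $title);
    if ($coverage >= 1.0) {
        return '1a';
    }
    if ($coverage >= $threshold) {
        return '1b';
    }
    return $coverage > 0 ? '3' : 'none';
}
```

With this sketch, a two-word query that matches only one title word has a coverage of 0.5 and falls into item 3, while two matching words out of three (about 0.67) fall into item 1b.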
To query Sphinx from the code, I use:
$cl->SetMatchMode(SPH_MATCH_EXTENDED2);
$cl->SetRankingMode(SPH_RANK_SPH04);
$cl->SetFieldWeights(array('name' => 15, 'long_text' => 5));

This does push full matches in the title to the top of the results, but it has very little effect on the other items.
Key questions:
1. How do I include zones in the ranking?
2. How will this ranking behave for nested tags (for example, when h1 is inside szone10)?
3. How do I account for a partial match that covers most of the query tokens when ranking?
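
For reference on question 1: the extended query syntax has a ZONE limit operator, written as ZONE:(h3,h4) followed by the query, which restricts matching to the listed zones. It filters matches and is not a ranking factor by itself. The sketch below only builds such query strings in plain PHP. The helper name zoneQuery and the ordering of the tiers are assumptions made for illustration, and the query text is passed through without escaping.

```php
<?php
// Hypothetical helper: prefix a query with Sphinx's ZONE limit operator.
// The query itself is NOT escaped here; the caller has to take care of that.
function zoneQuery(array $zones, string $query): string
{
    if (count($zones) === 0) {
        throw new InvalidArgumentException('At least one zone is required');
    }
    foreach ($zones as $zone) {
        // Accept only plain tag-like names such as h1, b, title, szone10.
        if (!preg_match('/^[A-Za-z0-9_]+$/', $zone)) {
            throw new InvalidArgumentException("Invalid zone name: $zone");
        }
    }
    return 'ZONE:(' . implode(',', $zones) . ') ' . $query;
}

// Assumed priority order of zones, from most to least important.
$tiers = array(
    array('title'),
    array('h1', 'h2'),
    array('b'),
);

// One query string per tier, e.g. for separate searches whose results
// are merged in tier order on the application side.
$queries = array();
foreach ($tiers as $zones) {
    $queries[] = zoneQuery($zones, 'red apple');
}
```

Running one search per tier and merging the results in the application is only one possible workaround. Whether it fits depends on how many zones there are and on how nested zones such as h1 inside szone10 should be counted.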
