MySQL
pandaa, 2020-06-29 17:04:04

How to search for phrases in an inverted index?

Take the query "blue sky" as an example: the search index returns all documents that contain the word blue and all documents that contain the word sky. Next, we can simply pick out the documents that appear in both lists, and those will be the documents that contain the phrase blue sky.

But finding the documents common to both lists is a very resource-intensive operation, because you have to go through all of them, and there can be millions of documents.
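The approach described above can be sketched roughly as follows. This is only an illustration under assumptions: the tiny `docs` collection and the function names `build_index` and `naive_phrase_search` are made up for the example, and tokenization is a plain whitespace split.

```python
from collections import defaultdict

def build_index(docs):
    """Map each word to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def naive_phrase_search(docs, index, phrase):
    """Intersect the per-word document sets, then rescan each candidate."""
    words = phrase.lower().split()
    n = len(words)
    # Candidates: documents that contain every word of the phrase.
    candidates = set.intersection(*(index.get(w, set()) for w in words))
    hits = []
    for doc_id in sorted(candidates):
        tokens = docs[doc_id].lower().split()
        # Verification: look for the words as one consecutive run.
        if any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1)):
            hits.append(doc_id)
    return hits

# Hypothetical sample data, not taken from the question.
docs = {
    1: "the blue sky is clear",
    2: "the sky is blue",
    3: "blue paint",
}
index = build_index(docs)
print(naive_phrase_search(docs, index, "blue sky"))
```

The rescan of every candidate document is the part that grows with the size of the collection.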


3 answer(s)
Ivan Shumov, 2020-06-29
@inoise

The term in this case should be the two words together. No magic.
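One way to read "the term is two words" is a biword index, where every pair of adjacent words is stored as its own term. A minimal sketch, assuming whitespace tokenization; `build_biword_index` and the sample `docs` are hypothetical names for illustration only:

```python
from collections import defaultdict

def build_biword_index(docs):
    """Index every pair of adjacent words as a single term."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for first, second in zip(tokens, tokens[1:]):
            index[(first, second)].add(doc_id)
    return index

# Hypothetical sample data.
docs = {
    1: "the blue sky is clear",
    2: "the sky is blue",
    3: "blue paint",
}
biword_index = build_biword_index(docs)

# A two-word phrase query becomes a single lookup, with no intersection.
print(sorted(biword_index.get(("blue", "sky"), set())))
```

The trade-off is index size: there are far more distinct word pairs than distinct words.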

Dimonchik, 2020-06-29
@dimonchik2013

Download https://www.ozon.ru/context/detail/id/5497130/, it answers all of these questions from the start (scanned copies are on the net).
Everything is as you write: an index is a list of the documents that contain a word. Two words mean two lists and their intersection, an ordinary AND; no idea who said that it is resource-intensive.
But for the answer to be RELEVANT there is a lot layered on top, which is what Ivan Shumov is trying to explain to you. And yes, in modern search the index is a list of documents matching the VECTOR of the query. Nobody computes that at query time: everything is precomputed, and at query time the engine only tries to reduce the query to the best (few) precomputed vectors.
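To illustrate why intersecting two lists need not be expensive: if each word's document ids are kept sorted, the intersection is a single linear pass over both lists. A minimal sketch; `intersect_sorted` and the id lists are made up for the example:

```python
def intersect_sorted(a, b):
    """Intersect two ascending lists of document ids in O(len(a) + len(b))."""
    i = j = 0
    common = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # a[i] cannot occur in b any more, skip it
        else:
            j += 1  # b[j] cannot occur in a any more, skip it
    return common

# Hypothetical posting lists for the two words.
blue = [1, 2, 3, 8, 13]
sky = [1, 2, 5, 8, 21]
print(intersect_sorted(blue, sky))
```

Each id is visited at most once, so the cost depends on the lengths of the two lists, not on the size of the whole collection.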

xmoonlight, 2020-06-30
@xmoonlight

Take the query "blue sky" as an example: the search index returns all documents that contain the word blue and all documents that contain the word sky. Next, we can simply pick out the documents that appear in both lists, and those will be the documents that contain the phrase blue sky.
No. Those documents will contain both blue and sky, but not necessarily the exact phrase blue sky.
Build a "tree" of links between all words (a vector of which word follows which) and look for the required "chains" (phrases) in this "tree", with whatever maximum "distance" you choose.
When importing a new document, extend/update the "tree" of links.
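The answer does not spell out the data structure, so the following is only one possible reading: a positional index that stores, for each word, the positions where it occurs in each document. Chains are then found by checking that each word follows the previous one within a maximum distance. The names `add_document` and `find_chain` and the sample texts are assumptions made for this sketch.

```python
def add_document(index, doc_id, text):
    """Add (or extend) the word -> {doc_id: [positions]} links for one document."""
    for pos, word in enumerate(text.lower().split()):
        index.setdefault(word, {}).setdefault(doc_id, []).append(pos)

def find_chain(index, words, max_distance=1):
    """Ids of documents where the words occur in order, each one at most
    max_distance positions after the previous one."""
    if not words or any(w not in index for w in words):
        return []
    # Only documents that contain every word can hold the chain.
    common = set(index[words[0]])
    for w in words[1:]:
        common &= set(index[w])
    hits = []
    for doc_id in sorted(common):
        # Positions at which a partial chain currently ends.
        ends = set(index[words[0]][doc_id])
        for w in words[1:]:
            ends = {p for p in index[w][doc_id]
                    if any(0 < p - e <= max_distance for e in ends)}
            if not ends:
                break
        if ends:
            hits.append(doc_id)
    return hits

# Hypothetical sample data.
index = {}
add_document(index, 1, "the blue sky is clear")
add_document(index, 2, "the sky is blue")
add_document(index, 3, "blue and cloudy sky")

print(find_chain(index, ["blue", "sky"]))                  # exact phrase
print(find_chain(index, ["blue", "sky"], max_distance=3))  # looser chain

# Importing a new document only touches the entries for its own words.
add_document(index, 4, "deep blue sky")
print(find_chain(index, ["blue", "sky"]))
```

With `max_distance=1` this behaves as an exact phrase search; larger values allow other words in between.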
