Answer the question
In order to leave comments, you need to log in
How to efficiently find all objects that have all the given words in their name?
Specifications Input
Let's say I have an array of stores. A store is an object that has a name field with the name of the store, for example: {name: "Hardware store", id: 141, ...}. There are thousands of shops. I also have an array consisting of combinations of two words, for example: .
Task
It is necessary to find all stores where both words from some combination occur in the name.
A head-on solution is to loop through the combinations and for each combination iterate over the entire array of stores, but there is probably a more optimal solution.
Answer the question
In order to leave comments, you need to log in
Of course have. You can increase the speed by using memory.
For example, you can make an associative array (dictionary, hash table), in which the keys will be individual words ("construction", "repair", etc.), and the values will be lists of stores in which this key occurs. Then, when searching for the next pair of words, you will need to select two short lists and sort through them, finding common elements (i.e. stores with both words in the name at once).
If you think about it, you can still optimize something. But it is already necessary to know more nuances of your specific task.
Here you can go shopping in one pass. First go through your combinations and add the words into a hashmap "word" -> list of numbers of pairs that contain it.
Then, for each store, mark the array with the length of how many pairs of tags you have with zeros. Then go through each word and add 1 to each pair in the array in which this word occurs, using the map from the previous paragraph. If somewhere you got 2 in the process, then this store is right for you.
If you can pre-calculate something for all stores once, then you can then execute "requests" without even going through all the stores.
For each tag, you can make a list of stores that have it in the name, sorted somehow (for example, by store id). This structure can be built in one pass through all the stores and saved.
Your request for tag pairs can then be processed by concatenating and traversing the ordered lists. These operations, as in merge sort, are done in one pass through the list.
In the worst case, where a very frequent tag is used, the processing of the request will be almost like a complete scan of all stores (even if a small part of them are returned), but on average this method will work well.
Walking through the shops, each name is divided into words, for each word the basic morphological form is found. Save records (shop id - word id). In the search for pairs, we also bring both words to the base. You still need not "construction, paints", but all possible declensions of these words. And the selection "find records where at least two lines match the key" is elementary.
Didn't find what you were looking for?
Ask your questionAsk a Question
731 491 924 answers to any question