V
V
Vasily G.2020-02-17 00:06:54
Sphinx
Vasily G., 2020-02-17 00:06:54

Smart search in a large database. What to choose?

Hello!

I'm looking for a ready-made service to work with a constantly growing database.

It is necessary to create a database, where ~ 15k will be added daily (even more later).

Two scenarios for working with the database are planned:
1. Adding new records: if the row does not exist in the database, then add it to the database.

2. Search in the database:
- search in text fields for given keywords/phrases in different languages ​​(taking into account morphology)
- search by numeric fields
- sorting by date of addition

Here is an approximate entity structure:
- title (text field)
- description ( text field)
- quantity in stock (number)
- quantity in store (number)
- categories (an array of numeric values, category ids)

If you describe the query in words, it will look something like this:
Get all records for which the following conditions are met:
1. The title or description contains the words apple, orange
2. The title or description does NOT contain banana
3. If there is a warehouse and a quantity > 10 or a quantity in the store > 1 5.
In the categories ID:10, ID: 20, ID:30

that were added more than 72 hours ago.

I thought to develop, for example, on MySQL + Sphinx, or Elasticsearch, but it is long and expensive, and besides, you need to constantly support it.

I looked towards ready-made solutions and SaaS services. I tried Elastic App Search, everything was fine, but I ran into 2 limited problems:
1. if the key phrases are in different languages, the morphology does not work.
2. there is a limit of 128 characters in the search query. The forum suggested that there should not be such long search queries in principle, and I don’t use the service for that.

Are there really no ready-made solutions to create an entity, set fields and add records using the API and search for them?

Answer the question

In order to leave comments, you need to log in

1 answer(s)
X
xmoonlight, 2020-02-17
@xmoonlight

An index (book-like) of record id/row ranges for each language and for each unique term from all rows for each language.
When you change the base - always update the pointer.
The order of the search conditions from your example (out of 5 items) is correct. This is a valid "waterfall" sampling filter.

Are there really no ready-made solutions to create an entity, set fields and add records using the API and search for them?
Of course there is, but not in the public domain and not for free.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question