Sphinx or ElasticSearch?
I need to implement full-text search with morphology support over documents on a site. The corpus is about 500 GB of plain-text files (.txt) in UTF-8, each no larger than 5 MB. Roughly 100 MB is added every day and about the same amount is removed. Most of the files are in Russian. The expected load is about 40-70 queries per minute.
What is the best way to solve this? We are currently considering Sphinx and Elasticsearch. Which search engine suits this task better? What are their weak points, and what problems might come up?
Elasticsearch is very flexible: it lets you change the data schema while the system is running, and it has a huge number of other features. You talk to it over a REST API, exchanging data as JSON.
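To give a feel for the REST/JSON exchange, here is a minimal sketch in Python that only builds the HTTP requests and does not send them. The host `http://localhost:9200`, the index name `documents`, and the field name `content` are assumptions made for illustration; `russian` is one of Elasticsearch's built-in language analyzers, which handles morphology through stemming.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local single-node instance
INDEX = "documents"               # hypothetical index name


def _json_request(url, body, method):
    """Wrap a dict as a JSON HTTP request (built here, not sent)."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def build_index_request():
    """PUT /<index>: create the index with a Russian-analyzed text field."""
    body = {
        "mappings": {
            "properties": {
                # "content" is a hypothetical field holding the file text
                "content": {"type": "text", "analyzer": "russian"}
            }
        }
    }
    return _json_request(f"{ES_URL}/{INDEX}", body, "PUT")


def build_search_request(text, size=10):
    """POST /<index>/_search: full-text match query on the content field."""
    body = {"query": {"match": {"content": text}}, "size": size}
    return _json_request(f"{ES_URL}/{INDEX}/_search", body, "POST")
```

Passing either request to `urllib.request.urlopen` would send it to a running node; an official client library would normally take the place of the hand-built requests.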
Sphinx is fairly simple: all configuration, including the data schema, is written in a file, and you then query it with SphinxQL, its SQL-like dialect.
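As a rough illustration of the SQL-like interface, the sketch below builds a parameterized SphinxQL full-text query in Python. The index name `documents` is hypothetical, and the connection details in the comments (a MySQL-protocol listener on port 9306, the third-party `pymysql` driver) are assumptions about one possible setup, not something the question specifies.

```python
def build_sphinxql_search(index, text, limit=10):
    """Return (sql, params) for a SphinxQL full-text search.

    The index name is validated and interpolated because identifiers
    cannot be passed as query parameters; the search text stays a
    parameter so the driver escapes it.
    """
    if not index.isidentifier():
        raise ValueError(f"unsafe index name: {index!r}")
    sql = f"SELECT id FROM {index} WHERE MATCH(%s) LIMIT {int(limit)}"
    return sql, (text,)


# Assumed usage with a MySQL-protocol driver (not executed here):
#   import pymysql
#   conn = pymysql.connect(host="127.0.0.1", port=9306)
#   with conn.cursor() as cur:
#       cur.execute(*build_sphinxql_search("documents", "поиск"))
#       rows = cur.fetchall()
```

Morphology itself is set on the index in the configuration file, for example with `morphology = stem_ru` for Russian stemming, so the query does not need to mention it.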
IMHO, ES would be overkill in your case, and I would advise choosing Sphinx.
Both will work for your task. Sphinx indexes faster and uses less memory; Elasticsearch is more flexible: it can build facets and other aggregations, index arrays and nested fields, and so on. In your case I would also recommend Sphinx, because the initial indexing of 500 GB in Elasticsearch will take a very long time.