Sphinx
Vyacheslav Saratovsky, 2015-03-23 14:30:38

Sphinx or ElasticSearch?

I need to implement full-text search over documents on a site, with morphology support. The corpus is 500 GB of text files (.txt), each no larger than 5 MB, encoded in UTF-8. About 100 MB is added daily and about the same amount is removed. Most of the files are in Russian. The load is about 40-70 requests per minute.
What is the best way to solve this? We are currently considering Sphinx and ElasticSearch. Which search engine suits this task better? What are their weak points, and what problems might arise?


3 answer(s)
Yuri Shikanov, 2015-03-23
@super-developer

ElasticSearch is very flexible: it lets you change the data schema while it is running, and it has a huge number of other features. You talk to it over a REST API, exchanging data as JSON.
Sphinx is quite simple: all configuration, including the data schema, is written in a file, and you can then query it with an SQL-like language (SphinxQL).
IMHO, ES would be overkill in your case, and I would advise you to choose Sphinx.
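
To make the difference between the two interfaces concrete, here is a minimal sketch that only builds the requests and does not send them. The index name `documents`, the field name `content`, and both helper functions are hypothetical, chosen for illustration. The SphinxQL string uses `%s` placeholders on the assumption that a MySQL client library fills them in on the client side.

```python
import json


def es_search_request(text, index="documents", field="content", size=10):
    """Return (method, path, body) for an Elasticsearch match query.

    The index and field names are placeholders, not taken from the question.
    """
    body = {"query": {"match": {field: text}}, "size": size}
    # ensure_ascii=False keeps Russian text readable in the JSON body
    return "POST", f"/{index}/_search", json.dumps(body, ensure_ascii=False)


def sphinxql_search(text, index="documents", limit=10):
    """Return (sql, params) for a SphinxQL full-text query.

    The index name is a placeholder; params are meant to be passed to a
    MySQL client library that substitutes them safely.
    """
    sql = f"SELECT id FROM {index} WHERE MATCH(%s) LIMIT %s"
    return sql, (text, limit)
```

In both cases the morphology handling itself would be set on the index side, not in the query, so the query shape stays the same whichever stemmer is used.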

un1t, 2015-03-23
@un1t

Both will work for your task. Sphinx indexes faster and uses less memory. Elasticsearch is more flexible: it can build facets and other aggregations, index arrays and nested fields, and so on. In your case I would also recommend Sphinx, because the initial indexing of 500 GB in Elasticsearch will take a very long time.
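
For a sense of scale, the numbers from the question can be turned into a rough back-of-the-envelope sketch. The function name and the `index_mb_per_s` throughput are assumptions: the default of 10 MB/s is a placeholder, not a measurement of either engine, and should be replaced with a figure from your own test run.

```python
def workload_summary(corpus_gb=500, max_file_mb=5, daily_add_mb=100,
                     rpm_low=40, rpm_high=70, index_mb_per_s=10.0):
    """Rough sizing figures derived from the numbers in the question.

    index_mb_per_s is a hypothetical indexing throughput, not a benchmark.
    """
    corpus_mb = corpus_gb * 1000  # decimal units: 1 GB = 1000 MB
    return {
        # lower bound on file count, since each file is at most max_file_mb
        "min_file_count": corpus_mb // max_file_mb,
        # share of the corpus that is added (and removed) each day
        "daily_churn_fraction": daily_add_mb / corpus_mb,
        # query load converted from requests per minute to per second
        "qps_low": rpm_low / 60,
        "qps_high": rpm_high / 60,
        # one-off full indexing time at the assumed throughput
        "initial_index_hours": corpus_mb / index_mb_per_s / 3600,
    }
```

With the defaults this gives at least 100,000 files, a daily churn of 0.02% of the corpus, and a query load of roughly one request per second, so the one-off initial indexing is the expensive part and the daily updates are small by comparison.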

Puma Thailand, 2015-03-24
@opium

Both will cope.
For your task there is no difference between them.
