Full-text search: MySQL or Sphinx?
I have a lot of data in MySQL (~100 GB) and MongoDB (~50 GB), and I need fast full-text search over it.
At the moment I am using Sphinx for this. But recent versions of MySQL support full-text search, so I'm thinking of moving all the data into MySQL and dropping MongoDB and Sphinx.
What do you think? Will I lose a lot of search speed? Is it worth doing?
Full-text search is a fairly primitive thing.
Every engine has it, and they differ only in the nuances.
1. The text is split into separate words; short words and auxiliary words (stop words) are discarded.
2. The words are run through stemming (the endings are cut off), see snowball.tartarus.org/algorithms/russian/stemmer.html
3. An index is built from the resulting words, something like roaringbitmap.org
All of them (MySQL, PostgreSQL, SphinxSearch, ManticoreSearch, ElasticSearch) work according to this algorithm when it comes to full-text search.
Search quality depends mainly on steps 1 and 2, plus manual tuning (an additional dictionary and so on).
Search speed depends on step 3.
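The three steps above can be sketched roughly as follows. This is only a toy illustration, not how any of the engines mentioned is actually implemented: the stop-word list, the suffix list, the minimum word length and all function names are made up for the example, the stemmer is a naive suffix stripper rather than Snowball, and plain Python sets stand in for compressed bitmaps.

```python
import re
from collections import defaultdict

# Hypothetical values chosen for the example only.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for"}
SUFFIXES = ("ing", "ed", "es", "s")
MIN_WORD_LEN = 3


def tokenize(text):
    """Step 1: split into lowercase words, drop short and auxiliary words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if len(w) >= MIN_WORD_LEN and w not in STOP_WORDS]


def stem(word):
    """Step 2: naive stemming, strip the first matching suffix."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Step 3: map each stem to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[stem(word)].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every stem of the query."""
    stems = [stem(w) for w in tokenize(query)]
    if not stems:
        return set()
    result = set(index.get(stems[0], set()))
    for s in stems[1:]:
        result &= index.get(s, set())  # intersect posting sets
    return result


docs = {
    1: "Sphinx is a fast search server",
    2: "MySQL supports full-text searching in InnoDB tables",
    3: "MongoDB stores documents",
}
index = build_index(docs)
print(search(index, "searches"))
```

The query "searches" is stemmed to the same form as "search" and "searching", so it matches documents 1 and 2. Better stop-word lists and stemmers improve steps 1 and 2 (quality); a more compact structure for the posting sets improves step 3 (speed).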
There are small differences. For example, ElasticSearch can work with an index stored on a cluster of several servers, so it is not limited in index size as strictly as SphinxSearch (where it matters that the data sits on a single server).
On the other hand, SphinxSearch and its fork ManticoreSearch are extremely speed-focused. In particular, they follow the approach of ignoring index-build errors wherever possible, all for the sake of speed.
MySQL and PostgreSQL have no advantage in speed (like Sphinx/Manticore) or in index size (like ElasticSearch). Their advantage is ease of use when your data already lives in a relational DBMS.
No, you won't get a speed boost by switching from Sphinx to MySQL. Sphinx is faster; it is built precisely for speed.
That said, you may not need the kind of speed Sphinx is tuned for. The convenience of keeping everything in a relational DBMS like MySQL may well outweigh it.
And yes, it is not clear why you need MongoDB. SphinxSearch has long been able to store the data itself, not just the search index. An additional call to MongoDB after the document has already been found in SphinxSearch reduces performance. MongoDB may be handy for some auxiliary work, such as initiating a full-text index build, but in the full-text search path itself it is an unnecessary link.
Nobody has yet come up with anything better than a separate search server such as Sphinx or Elasticsearch.
"Recent" as in the last 10 years? MySQL full-text search is already covered in mold.
The only thing added "recently" was InnoDB support in 5.6, and that was back in 2012.