MySQL sharding and searching across shards?

MySQL

serious911, 2017-05-04 11:13:10

Hello.
I'm currently working on a small Node.js application. It uses MySQL (MariaDB) for data storage, Redis for caching, and Express.js. It still runs on a single server, but it has become necessary to scale it and distribute the database across several servers, so I started looking into sharding and database replication.
For now I have decided on horizontal sharding as the main tool for scaling the data. Since I don't have much experience with sharding, I'm asking for advice here. How to split the data into shards and how to store it is roughly clear, but some questions remain.
1) How do I work with dozens of shards (database servers) at once from a Node.js application? Right now I work with a single DB instance. After sharding, will I have to open connections to several shards and issue several database queries for a single request to the application API?
2) How do I search for and retrieve data across shards? For example, one page needs to display 10 users at once (name, avatar, etc.) who are spread across different shards. What should I do in that case? Important data could also be kept in a cache (Memcached / Redis), but then searching and retrieving would require clustering the caches as well. Is it possible to index the sharded data with Elasticsearch and search there?
3) What pitfalls should I watch for at the initial stage to avoid problems later?
Please share your own experience and advice.
Thank you.

2 answer(s)

FanatPHP, 2017-05-04
@FanatPHP

A classic case of the XY problem.
Instead of writing "I don't know what I need, but I've decided I need sharding, so tell me how to use it", you should write "I need to scale for such-and-such a reason under such-and-such conditions; tell me the best option".
Then you would get answers like these, ordered by how often each turns out to be the real reason behind the question:
1. No, there is no need to be afraid of a million records. MySQL can handle many times more.
2. Set up master-slave replication with the required number of slaves.
3. Perhaps, under some conditions, sharding is suitable too.
As expected, it all came down to point 1.
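
For point 2, the application-side change is mostly a matter of deciding where each statement goes. A minimal sketch, assuming the targets are plain labels standing in for real connection pools (`createRouter` and `pick` are hypothetical names, not part of any MySQL driver):

```javascript
// Route statements between one master and any number of read replicas.
// `primary` and `replicas` are placeholders for real connection pools.
function createRouter(primary, replicas) {
  let next = 0; // round-robin cursor over the replicas
  return {
    // Plain SELECTs rotate across replicas; everything else goes to the master.
    pick(sql) {
      const isRead = /^\s*select\b/i.test(sql);
      if (!isRead || replicas.length === 0) return primary;
      const target = replicas[next % replicas.length];
      next += 1;
      return target;
    },
  };
}

const router = createRouter('primary', ['replica-1', 'replica-2']);
```

Replication is asynchronous by default, so a read issued right after a write may not see it on a replica yet; reads that must be current (and locking reads such as `SELECT ... FOR UPDATE`) would still need to go to the master.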

Artem Kustikov, 2017-05-12
@art1z

Sharding makes sense precisely when the data is most often read from a single shard: pictures, for example, or statistics for various servers/services. Users, as the foundation of the application, should physically live in one database. Even the theoretical maximum of 6 billion users would take only a couple of terabytes, which is not a problem for MySQL - https://habrahabr.ru/post/64851/
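
To show what the "10 users on one page" case from the question costs once users are sharded, here is a rough sketch. It assumes numeric user IDs, a fixed shard count, and a simple modulo scheme; `queryShard(shard, ids)` is a hypothetical async function the caller would supply to run the actual query on one shard.

```javascript
const SHARD_COUNT = 4; // assumed number of shards

// Map a numeric user ID to a shard index (simple modulo scheme).
function shardFor(userId) {
  return userId % SHARD_COUNT;
}

// Group IDs by shard so that each shard receives a single query.
function groupByShard(userIds) {
  const groups = new Map();
  for (const id of userIds) {
    const shard = shardFor(id);
    if (!groups.has(shard)) groups.set(shard, []);
    groups.get(shard).push(id);
  }
  return groups;
}

// Query every shard involved in parallel, then merge the rows
// back into the order in which the IDs were requested.
// `queryShard` is hypothetical and must resolve to rows with an `id` field.
async function fetchUsers(userIds, queryShard) {
  const groups = groupByShard(userIds);
  const results = await Promise.all(
    [...groups].map(([shard, ids]) => queryShard(shard, ids))
  );
  const byId = new Map(results.flat().map((user) => [user.id, user]));
  return userIds.map((id) => byId.get(id)).filter(Boolean);
}
```

One page load turns into up to `SHARD_COUNT` queries plus a merge step, and with a modulo scheme changing the number of shards remaps most IDs. Both are reasons to keep users in one database for as long as it fits.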