F
F
footballer2020-09-08 14:06:56
Database design
footballer, 2020-09-08 14:06:56

What is the benefit of sharding a database with indexes?

Sharding (horizontal partitioning) is a database design principle in which logically independent rows of a database table are stored separately, grouped in advance into sections, which, in turn, are placed on different, physically and logically independent database servers

If we have a table with millions of records and to search for specific data of a particular user in it, we enumerate all the data of all users (if there are no indexes), then yes - splitting the table and storing data for each individual user on different servers makes sense. But if we have indexes on the user ID field (and there are always indexes in large tables), then we will not go through all the data of all users - then how does such separate storage of data on different servers for each user help us?
Unless when updating / inserting data for one user, we need to update the data index of only this user, and not the general index for all (but the gain is not obvious here either - because although the index has become smaller and is updated faster, on each request we you need to additionally sort out from which shard to take data for a given user).
In general, what exactly is the benefit of sharding?

Answer the question

In order to leave comments, you need to log in

5 answer(s)
M
Melkij, 2020-09-08
@melkij

As long as you only have millions of records, then you don’t need sharding or even partitioning.

In general, what exactly is the benefit of sharding?

Scaling write operations when more productive disks and no longer deliver.

I
Ivan Shumov, 2020-09-08
@inoise

indexes work locally, Sharding - globally. As long as you have one small server, then Sharding will not help you, and when you have a cluster, it’s very

V
Vitsliputsli, 2020-09-08
@Vitsliputsli

In general, what exactly is the benefit of sharding?

Sharding is not intended to speed up data access, so it makes no sense to compare with indexes, at least not in the context you described. Shariding is a variant of horizontal scaling. When you can no longer increase the capacity of one DBMS server to meet the increased needs, then you will have to think about several DBMS servers, i.e. about horizontal scaling, and sharding is one of its variants.

V
Vladimir Korotenko, 2020-09-08
@firedragon

Let's say you have a disk of 100 gigabytes
User data 1 megabyte
Total 102400 users. Where are we going to put the rest?

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question