Distributed applications and user data, or how does horizontal scaling happen?

IDONTSUDO, 2020-01-13 23:48:08

linux

How do social networks manage to build such distributed systems? I watched a HighLoad++ talk about the social network Odnoklassniki, where it was said that they have about 11 thousand servers and about 1 Tbps of traffic.
Rather, the question is: which methods of balancing user requests are used in such huge systems so that user data is not lost? At first I thought everything was done by hashing the client IP, which would be convenient and logical. But the hash is strictly tied to the IP (that is, requests are balanced according to the IP), and my IP can change. That makes no sense: through a VPN I can reach a different server than the one I reach from my own city, and if that server does not have my data, the request fails. Then the system would have to either refuse to authorize the user or look for the data in a hundred other databases.
In fact, even if we take something like Cassandra, which knows how to distribute data, we still run into the problem of user authorization.

3 answer(s)

Vitaly Karasik, 2020-01-14
@IDONTSUDO

Rather, the question is: which methods of balancing user requests are used in such huge systems so that user data is not lost?

The short answer is that every server has access to a backend database that holds all the data. How that database is synchronized and replicated around the world is a separate, interesting question; see the CAP theorem: https://en.wikipedia.org/wiki/CAP_theorem.
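
To illustrate the point, here is a minimal sketch in Python. It assumes stateless application servers behind a round-robin balancer, all sharing one session store. The names `SESSION_STORE`, `AppServer`, and `handle` are made up for this example, and the in-memory dict only stands in for a real replicated database or cache.

```python
import itertools
import secrets

# Hypothetical shared backend: every app server reads and writes the same store.
# In a real system this would be a replicated database or cache, not a dict.
SESSION_STORE = {}


class AppServer:
    """A stateless application server: it keeps no user data of its own."""

    def __init__(self, name):
        self.name = name

    def login(self, user_id):
        # Create a session token and save it in the shared store.
        token = secrets.token_hex(16)
        SESSION_STORE[token] = user_id
        return token

    def whoami(self, token):
        # Any server can resolve the token, because the store is shared.
        return (self.name, SESSION_STORE.get(token))


servers = [AppServer(f"app-{i}") for i in range(3)]
round_robin = itertools.cycle(servers)


def handle(method, *args):
    """Toy balancer: send each request to the next server in turn."""
    server = next(round_robin)
    return getattr(server, method)(*args)


token = handle("login", "user-42")
results = [handle("whoami", token) for _ in range(3)]
```

Each `whoami` call lands on a different server, yet all of them recognize the same user. The balancer does not need to know the client IP at all, so a changed IP or a VPN does not matter.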

Germanjon, 2020-01-14
@Germanjon

Sorry, are you trying to solve a specific problem, or are you interested in general theoretical material? If it is general theory, it takes a long time to read up on and study.

Valentine, 2020-01-23
@ProFfeSsoRr

we still run into the problem of user authorization

Authorizing a million users takes, roughly speaking, a database with one table and a million rows in it. That is a trivial amount of data.
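
A rough sketch of that scale, using Python's built-in `sqlite3` with an in-memory database. The table layout, the `check_login` helper, and the account `alice` are assumptions made up for this example. The filler rows share one dummy hash so the demo loads quickly, and the PBKDF2 iteration count is kept low for the same reason.

```python
import hashlib
import hmac
import os
import sqlite3

N_USERS = 1_000_000  # the "million rows" from the answer above


def hash_password(password, salt):
    # PBKDF2 from the standard library; the iteration count is low only for the demo.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000)


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE users ("
    " login TEXT PRIMARY KEY,"
    " salt BLOB NOT NULL,"
    " password_hash BLOB NOT NULL)"
)

# Filler accounts: placeholder salt and hash, only there to reach a million rows.
dummy_salt = b"\x00" * 16
dummy_hash = b"\x00" * 32
db.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    ((f"user{i}", dummy_salt, dummy_hash) for i in range(N_USERS)),
)

# One real (hypothetical) account to log in with.
salt = os.urandom(16)
db.execute(
    "INSERT INTO users VALUES (?, ?, ?)",
    ("alice", salt, hash_password("s3cret", salt)),
)
db.commit()


def check_login(login, password):
    """Look the user up by primary key and compare password hashes."""
    row = db.execute(
        "SELECT salt, password_hash FROM users WHERE login = ?", (login,)
    ).fetchone()
    if row is None:
        return False
    stored_salt, stored_hash = row
    return hmac.compare_digest(stored_hash, hash_password(password, stored_salt))
```

The lookup in `check_login` goes through the primary-key index, so it touches a handful of pages rather than scanning the table, even with a million rows loaded.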

or search for data in a hundred other databases.

You don't need hundreds of databases. Just replicas, sharding, and a lot of thinking about what to store, how, and where.
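
As a rough illustration of "replicas and sharding", here is a sketch that picks a shard from a stable hash of the user ID (not the IP), sends writes to that shard's primary, and sends reads to one of its replicas. The shard count, the node names, and the functions `shard_for` and `node_for` are assumptions for this example, not a description of any particular social network.

```python
import hashlib
import random

N_SHARDS = 4  # hypothetical shard count

# Hypothetical topology: each shard has one primary and two read replicas.
SHARDS = [
    {
        "primary": f"db{i}-primary",
        "replicas": [f"db{i}-replica-a", f"db{i}-replica-b"],
    }
    for i in range(N_SHARDS)
]


def shard_for(user_id):
    """Map a user ID to a shard index with a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and so would differ between servers.
    """
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % N_SHARDS


def node_for(user_id, write):
    """Writes go to the shard's primary, reads to a random replica."""
    shard = SHARDS[shard_for(user_id)]
    if write:
        return shard["primary"]
    return random.choice(shard["replicas"])
```

Because the key is the user ID, the same user always maps to the same shard, whichever application server or IP address the request came through. One caveat of this simple modulo scheme is that changing `N_SHARDS` remaps most users; consistent hashing is a common way to soften that.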