How do I store and serve users' media files across multiple servers?
I'm starting to learn and apply development methods for high-load projects, and several questions have come up.
There is a PHP application (nginx, PostgreSQL) with a users table. Each user can upload media content: pictures, videos, documents. Right now all of it is stored in one folder on the server; each file gets a unique hash-based name and is placed into subfolders based on the user id.
1. Two servers are allocated for storing files. I need to distribute file uploads between them, and accordingly, for each file I need to know which server it is stored on. How should I partition the files, and on what basis? Do I need a balancer to serve the media back, or is it enough to record the full path, including the server, when saving the file?
2. How do I run two web servers with the same version of the application and balance between them? Can I get by with nginx, or do I need HAProxy? Do I understand correctly that to connect a third server I would also need to deploy the same version of the application on it? And what if I need to add 10 servers and balance between them?
3. A question about horizontal scaling: suppose the database grows and the load has to be distributed across separate database servers.
Should I write the balancer myself to spread the creation of new users across databases, or use PgBouncer together with PgPool? What other options are there?
1. If there are physically many files, or a lot of traffic to them, then spread them out as you described.
Split them on a simple 50/50 basis: one file here, the next one there.
A balancer is needed when the same content is available on many servers at once and you distribute the load between them.
Here each server holds unique content, so there is nothing to balance.
Just address that server directly (s1.example.com, s2.example.com).
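The scheme above can be sketched as follows. This is a minimal illustration, not a production implementation; the hostnames and the `md5`-modulo split are assumptions, and the idea is that the full URL is computed once at upload time and stored in the database, so serving the file back needs no balancer or lookup:

```python
import hashlib

# Hypothetical storage hosts; add more entries to change the split.
SERVERS = ["https://s1.example.com", "https://s2.example.com"]

def pick_server(file_hash: str) -> str:
    # Deterministic ~50/50 split: hash the file name and take it
    # modulo the number of storage servers.
    idx = int(hashlib.md5(file_hash.encode()).hexdigest(), 16) % len(SERVERS)
    return SERVERS[idx]

def media_url(file_hash: str, user_id: int) -> str:
    # Record this full URL at upload time; clients then fetch the
    # file directly from the server that holds it.
    return f"{pick_server(file_hash)}/{user_id}/{file_hash}"
```

Because the choice is deterministic, the same file name always maps to the same server, even if the stored URL were ever lost.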
2. nginx will do a great job of balancing.
Yes, when adding a new node, make sure it has an exact copy of the application.
Usually this is done with a reference server image: when creating a new node you simply select that machine image, and voila, you have a ready node.
Whether you have 2 servers or 20, nginx will balance between them however you like.
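A minimal nginx sketch of this setup, with hypothetical hostnames and ports; adding a tenth node is just one more `server` line in the upstream pool:

```nginx
# Hypothetical pool of identical application nodes.
upstream app_backend {
    server app1.example.com:8080;
    server app2.example.com:8080;
    # server app3.example.com:8080;  # new node: one line + reload
}

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://app_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

By default nginx round-robins requests across the pool; weights and other balancing methods can be set per `server` line if the nodes differ in capacity.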
3. I don't have much experience with PostgreSQL; I can speak from MySQL experience.
Recent versions of MariaDB ship with Galera Cluster, a full-featured MySQL master-master cluster.
Perhaps PostgreSQL has an equivalent.
The idea is that database traffic is spread across the nodes: SELECT queries are executed on different machines, while the remaining (write) queries run on all machines at once.
Some application nodes work with one database node, others with another, while the data on all of them stays identical.
Another option is sharding: "manual" distribution of data between multiple servers.
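The simplest form of such "manual" sharding can be sketched like this. The shard hostnames are hypothetical, and a real setup would need a plan for resharding when nodes are added; the point is only that a user's id deterministically fixes which database holds their rows:

```python
# Hypothetical database shards.
SHARDS = ["db1.example.com", "db2.example.com", "db3.example.com"]

def shard_for_user(user_id: int) -> str:
    # Modulo sharding: a given user's data always lives on the
    # same shard, so every query for that user goes to one server.
    return SHARDS[user_id % len(SHARDS)]
```

All queries touching one user then hit a single shard, while different users' load is spread across all of them.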
Before thinking about multiple database nodes, make sure this particular part of the application is actually the bottleneck.
It is much easier to bolt on a cache than to organize work across several databases.
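To illustrate how little code "bolting on a cache" takes compared with sharding, here is a toy read-through cache with a time-to-live. It is a sketch only; in production you would use something like memcached or Redis rather than an in-process dict:

```python
import time

class TTLCache:
    """Toy read-through cache: serve hot query results from memory
    and only hit the database when the entry is missing or stale."""

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = time.monotonic()
        hit = self.store.get(key)
        if hit and hit[0] > now:
            return hit[1]          # fresh cache hit, no DB query
        value = loader()           # e.g. run the real SQL query here
        self.store[key] = (now + self.ttl, value)
        return value
```

If most traffic is repeated reads, a cache like this can cut database load enough that sharding never becomes necessary.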