How do you ensure completely uninterrupted operation of a server?
Situation 1:
A server runs a very important program that is demanding on disks, both in speed and in capacity.
Then a brick falls on the server (punching right through it), or it goes up in flames - either way the server is dead and the software no longer works. How do you avoid this?
Situation 2:
The same setup: a server plus important software.
A storm rolls in, lightning fries one substation and tears the wires off the second. Two enterprising guards have each siphoned half a tank of diesel out of the generator, the UPS did its job and kept the servers running, and now all that is left is to pray that the electricians restore power while you watch the battery percentage tick down...
How do you avoid this?
Option number 1 - create a failover cluster: two physical servers work as a pair. One server does the work while the second stands by in reserve, continuously receiving an up-to-date copy of the data from the first (this can be done with various tools). If the first server dies, the second takes over the load.
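A minimal sketch of that idea in Python, assuming a hypothetical health-check URL on the primary and leaving the actual takeover step as a placeholder (the endpoint name and thresholds are illustrative, not from the answer above):

```python
import time
import urllib.request

PRIMARY_HEALTH_URL = "http://primary.example.local/health"  # hypothetical endpoint
CHECK_INTERVAL = 5   # seconds between health checks
MAX_FAILURES = 3     # consecutive failures before declaring the primary dead

def primary_is_alive():
    """True if the primary answers its health check in time."""
    try:
        with urllib.request.urlopen(PRIMARY_HEALTH_URL, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False

def promote_standby():
    """Placeholder for taking over the primary's role, e.g. claiming a
    virtual IP or switching the data replica to read-write; the real
    mechanism depends on the stack."""
    print("primary is down - standby takes over the load")

failures = 0
while True:
    failures = 0 if primary_is_alive() else failures + 1
    if failures >= MAX_FAILURES:
        promote_standby()
        break
    time.sleep(CHECK_INTERVAL)
```

Requiring several consecutive failures before promoting avoids flapping on a single lost packet; real cluster managers add fencing so the old primary cannot come back and write in parallel.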
Option number 2 - applicable to websites: user requests are distributed across several servers according to certain rules; if one of the servers fails, the load on the remaining ones increases.
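A toy version of that distribution rule - a round-robin dispatcher that skips backends marked as failed (the host names are made up for illustration):

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hand out healthy backends in round-robin order."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)  # remaining servers absorb its share

    def mark_up(self, backend):
        self.healthy.add(backend)

    def next_backend(self):
        if not self.healthy:
            raise RuntimeError("no healthy backends left")
        while True:
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate

lb = RoundRobinBalancer(["web1", "web2", "web3"])  # hypothetical hosts
lb.mark_down("web2")                               # simulate a failure
print([lb.next_backend() for _ in range(4)])       # ['web1', 'web3', 'web1', 'web3']
```

In production this logic lives in a load balancer such as nginx or HAProxy, which also runs the health checks that drive mark_down/mark_up.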
Option number 3 - geographically dispersed duplicate services: the most reliable option, but building a cluster over long distances is very hard - there are problems with bandwidth, transmission delay, and intermittent link failures, and not every protocol designed for a local network can cope with them.
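To see why transmission delay alone is a problem, here is a back-of-the-envelope bound on the round-trip time between two data centers (the 3000 km distance is an arbitrary example):

```python
# Light in fiber covers roughly 200 km per millisecond (about 2/3 of c).
distance_km = 3000          # example distance between two data centers
km_per_ms = 200
rtt_ms = 2 * distance_km / km_per_ms
print(f"best-case round trip: {rtt_ms:.0f} ms")  # ~30 ms before any switching delays
```

Any synchronous replication scheme that waits for the remote site to acknowledge pays at least this latency on every write, which is why protocols tuned for a LAN often fall apart over such links.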
In general, the problem is solved with well-known techniques, applied with an eye to the specifics of the task and the existing architecture of the service.
There is no simple solution - there is no panacea for every problem.
Synchronization and distribution across regions: a high-availability or fault-tolerance service.
In any case, you need at least double the required capacity held in reserve.
The grown-ups move the program (service) into a virtual machine. The virtual machine runs on a cluster of physical servers with hypervisors; if one physical server fails, the VM is migrated to another via something like VMware vMotion, and the disks live in external storage connected over Fibre Channel. Naturally, each physical server is connected to different switches and routers.
That is how it works within a single room (data center). For geographically dispersed data centers you have to plan around the infrastructure of those data centers. And if even that is not enough, you work out how much downtime per month is acceptable and what the permissible recovery time is, and carry out a set of measures based on those numbers.
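To put numbers on "how much downtime is allowed per month", here is the standard arithmetic for a few common availability targets (the targets themselves are just examples):

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes in a 30-day month

for availability in (99.0, 99.9, 99.99):  # example uptime targets, in percent
    allowed = MINUTES_PER_MONTH * (1 - availability / 100)
    print(f"{availability}% uptime -> {allowed:.1f} min of downtime per month")
# 99.0%  -> 432.0 min (about 7.2 hours)
# 99.9%  -> 43.2 min
# 99.99% -> 4.3 min
```

Each extra "nine" cuts the budget by a factor of ten, which is exactly why the set of measures (and the cost) grows so quickly with the target.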
With only 1-2 servers you will never achieve fault tolerance high enough to withstand force majeure or natural disasters.
For such fault tolerance, you need to:
1. Separate the software from the hardware using virtualization, packaging the software into a container.
2. Run the container in a cloud, where such tasks are solved automatically at the container orchestration level.
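A toy model of what that orchestration layer does, assuming nothing beyond the Python standard library: it keeps the desired number of container "replicas" alive and replaces any that die. A real orchestrator such as Kubernetes does this reconciliation loop for you, plus scheduling across machines:

```python
import random

DESIRED_REPLICAS = 3
started = 0

def start_replica():
    """Stand-in for scheduling a fresh container somewhere in the cluster."""
    global started
    started += 1
    print(f"starting replica #{started}")
    return {"id": started, "alive": True}

replicas = [start_replica() for _ in range(DESIRED_REPLICAS)]

for tick in range(5):                  # a few reconciliation rounds
    for r in replicas:                 # simulate random failures
        if random.random() < 0.3:      # (in reality: failed health probes,
            r["alive"] = False         #  dead nodes, crashed processes)
    replicas = [r for r in replicas if r["alive"]]
    while len(replicas) < DESIRED_REPLICAS:  # reconcile actual vs desired state
        replicas.append(start_replica())
```

The key idea is declarative: you state the desired state ("3 replicas running") and the platform continuously converges the actual state toward it, which is what makes the failure of any single physical server a non-event.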