Ilya Rodionov, 2019-03-31 22:32:55

Automatic copying of Docker volumes?

Hello, colleagues.
I'm starting to dive into Docker and have run into the following problem. Maybe you have an idea how to solve it.
Task:
A fault-tolerant service running on multiple physical hosts.
Right now I work through Portainer: I give it a docker-compose file as a docker stack, and it automatically spreads everything across several physical machines, depending on how many instances I need.
The problem and question is this.
Suppose my service is a file upload service. When I go to the domain, I land on one of the running instances and upload a file there, but that file does not automatically appear on the other instances.
That is, in effect I get n different copies of the site, because each one stores its own data.
The question is: how can changes to files in a Docker volume be automatically replicated ("smeared") across the instances in the same way? Is that possible at all?
Thank you!


2 answer(s)
Tyranron, 2019-04-01
@Tyranron

You need to separate stateless services from stateful ones. The former can be spread across several hosts without a second thought; for the latter you have to think carefully about exactly how to scale them and what guarantees you need. Ideally, your application itself should be stateless: instead of mounting directories for long-term file storage, it should upload files to a dedicated file server, whether through an S3 bucket interface or, God forgive me, FTP.
Another important question is what guarantees you need in this situation. If you have few files, you can simply copy every file to each of the hosts (constant background synchronization). If there are many files, they will never fit on a single host, and you have to spread them across several hosts with a certain level of redundancy (we don't want to lose files forever when a disk dies on one of the hosts).
Options that come to mind:
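As an illustration of "upload files to a separate file server via an S3 interface" — a minimal Python sketch, assuming an S3-compatible endpoint (for example MinIO) at http://minio:9000, a bucket named "uploads", and the credentials shown; all of these names are placeholders, not something from the question:

import boto3

# Client for an S3-compatible object store; the endpoint and credentials are
# made-up placeholders for a MinIO instance on the internal network.
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio:9000",
    aws_access_key_id="minioadmin",
    aws_secret_access_key="minioadmin",
)

def save_upload(local_path: str, object_key: str) -> None:
    # Push an uploaded file into the shared bucket instead of a local volume.
    s3.upload_file(local_path, "uploads", object_key)

def fetch_upload(object_key: str, local_path: str) -> None:
    # Any replica of the service can fetch the same file back.
    s3.download_file("uploads", object_key, local_path)

With this approach the application containers stay stateless, and it no longer matters which instance handled the upload.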

  1. Run a distributed file system across all hosts (CephFS, GlusterFS, etc.). Mount the application container's volume on top of it and simply write files there as usual; read them from the same directory. The distributed file system will spread the files across hosts on its own, depending on the settings you choose.
    Pros: no need to change the application code, easy to use, a simple concept to understand.
    Cons: under intensive file I/O the performance may not be enough (such file systems are considered slow); if some hosts go down, writes may stop working (a quorum of n/2 + 1 is required); operating and supporting such systems is not the most trivial task (have fun fixing a broken Ceph cluster).
  2. Run MinIO on all hosts (a poor man's S3 bucket), or some other separate file server of your own. It works in two modes: single (each node works only with its own files; writing to several nodes and background synchronization must be handled on the application side) and distributed (the nodes form a quorum cluster that spreads the files among themselves).
    Pros: S3 bucket interface, easy to operate, can be mounted as a file system.
    Cons: you may have to change the application code (so it can talk to S3); read performance in distributed mode is slow (comparable to the distributed file systems from point 1).
  3. Simply mount a local directory on each host and set up constant background synchronization between them, for example via some flavor of BitTorrent Sync (or maybe even plain rsync); a rough sketch of this follows the list.
    Pros: the performance of a regular file system, no quorums (and therefore no write locks), easy to use (mount it and go).
    Cons: files become available on the other hosts only after some delay, so the application must be able to take this into account; data loss is possible (a node accepts files and burns out before it manages to sync them to the others); if the files do not fit on one host, this option does not work, or you will have to shard ("smear") them across hosts by hand in the application.
  4. Use a ready-made fault-tolerant cloud: AWS S3, DigitalOcean Spaces, etc.
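For option 3, the "constant background synchronization" could be a cron job or a tiny sidecar loop around rsync. A rough Python sketch, assuming hypothetical peer hosts app-host-2 / app-host-3, a shared directory /var/data/uploads/, and passwordless SSH between the hosts (all of these are made-up assumptions):

import subprocess
import time

PEERS = ["app-host-2", "app-host-3"]   # hypothetical peer hosts
UPLOAD_DIR = "/var/data/uploads/"      # hypothetical upload directory

def sync_once() -> None:
    # -a preserves attributes, -z compresses; the trailing slash on the
    # source copies the directory contents rather than the directory itself.
    for host in PEERS:
        subprocess.run(
            ["rsync", "-az", UPLOAD_DIR, f"{host}:{UPLOAD_DIR}"],
            check=False,  # don't stop the loop if one peer is unreachable
        )

if __name__ == "__main__":
    while True:
        sync_once()
        time.sleep(30)  # files reach the other hosts with up to ~30 s delay

The 30-second interval is arbitrary; it just makes the trade-off from the cons above concrete: files show up on the other hosts only after a delay.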

Ivan Shumov, 2019-04-01
@inoise

Move file storage to the cloud and don't rack your brain over it. I haven't worked with Docker that much, but volumes, although ideologically interesting, are completely useless in distributed systems.
