System administration
EchoStan, 2019-07-25 11:14:01

What does a normal build/monitoring setup look like these days?

Hello! The web project I work on is growing, and deployment has become painful. Right now I roll everything out with shell scripts over ssh, building the project from several git repos (frontend, backend, a couple of landing pages, etc., kept separately) and committing each new build to a separate branch before deploying.

The environment is as simple as it gets: a few bare VPSes with Ubuntu from DO. Node runs on them under the supervision of pm2, and the services talk to each other over TCP. So no, not a monolith. Periodically one of the Ubuntu boxes decides it wants more attention and gives me a couple of unforgettable hours.

If testing on the pre-production instances turns up an error (and yes, that testing sometimes gets skipped), or anything else goes wrong with a build, nothing protects me. I can't even restore the previous build and environment exactly, because there is no virtualization/containerization and the state of the OS changes from build to build (for example, a new dependency that requires a global install).

I remember with nostalgia the days when the application was a small MVP hosted on Heroku's PaaS.

Please tell me which tools I should start reading up on first. I want to borrow Heroku's "safe auto-build on push to master, with auto-rollback on any problem", plus a "roll back to this build" command, which Heroku even exposes in its GUI.

What do smart people use for this nowadays? My current line of thinking:

- Health checking of the business-logic layer. Right now it's limited to pm2, which restarts the application when it falls over and is itself started with the host OS. I'd like to tie health checking to deployment, so that the deploy rolls back automatically if the application comes up broken while the environment itself is fine. At the moment I react manually to Telegram alerts from Uptime Robot.

- Docker for containerization. Should the versioned final image contain only my code, or also something like node:latest (which, it seems, drags in Debian itself)? In the latter case the image weighs ~900 MB for a few hundred KB of client code, which looks ridiculous. In theory I only need to version the instructions for Docker, right? In my case that would be a Dockerfile? (A rough sketch of what I mean is after this list.)

- Versioning itself, based on the git repo with the source code plus the image/instructions for Docker. Do I write a script to manage all of this myself and run it on a separate VPS, or do I set up something like GitLab?
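
To make the Docker question concrete, here is roughly what I have in mind when I say "version only the instructions": a minimal multi-stage Dockerfile sketch. The build/output paths, the port and the /healthz endpoint are made up for illustration; the real project may differ.

  # Build stage: install all deps and build the bundle
  FROM node:12-alpine AS build
  WORKDIR /app
  COPY package*.json ./
  RUN npm ci
  COPY . .
  RUN npm run build                   # hypothetical build script

  # Runtime stage: only production deps plus the built code
  FROM node:12-alpine
  WORKDIR /app
  COPY package*.json ./
  RUN npm ci --only=production
  COPY --from=build /app/dist ./dist  # hypothetical output directory
  # Let the container runtime track liveness itself
  HEALTHCHECK --interval=30s --timeout=3s \
    CMD wget -qO- http://localhost:3000/healthz || exit 1
  CMD ["node", "dist/server.js"]

With an alpine base the image should come out in the tens of megabytes rather than ~900 MB, and only this file plus my code live in git; the base layers are pulled from the registry at build time.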

And how do I handle problems with the host itself when the OS suddenly gets tired? Are there monitoring tools that can reboot or recreate the VPS? (Hosting is DO for now, but we're not tied to it.)

I know that Chef, Puppet and k8s (and managed k8s, heh) exist. Does any of them cover the task 100%, so that I can concentrate on studying it and get a production-ready result in 2-3 weeks? I learn fast.


3 answer(s)
chupasaurus, 2019-07-25
@chupasaurus

Lots of questions, lots of answers.

  1. Health checking: expose the metrics you want to check on a separate URL and monitor it. If you can reduce it to a binary check that returns HTTP 200/500, then Docker/CRI-O/other runtimes can track the status themselves.
  2. A Docker image consists of a manifest that describes the exposed ports, anonymous volumes and metadata and, most importantly, a list of data layers. When you pull an updated image from a Docker Registry, layers you already have are not downloaded again.
  3. IaC: a Dockerfile plus, if needed, build scripts kept in the code, plus a build system. It is extremely convenient to store images in a Registry, and you can tag them by commit id if you want to tie them directly to the VCS (a sketch of that is after this list).
  4. There are two ways: external monitoring plus a configuration-management system (the former triggers the latter on an alert, and it brings up a new server elsewhere and silences the failed one), or an orchestrator that resolves such problems on its own.
  5. About bringing new servers in: the env is never dynamic; for that people use dynamic DNS (wrapped in the beautiful name Service Discovery), load balancers and message queues.
  6. Examples of solutions from personal practice. Without an orchestrator: in AWS it can be built on SNS + Auto Scaling; universally, on Prometheus/Alertmanager, Zabbix or Nagios triggering jobs in Ansible Tower (its open-source AWX version has all the features of the Enterprise edition), though it's better to put something in between for more control over what is happening. With k8s everything is simpler: Prometheus support is already there, the system itself monitors resource consumption, you can set CPU/RAM limits, and you only need to configure scaling of the worker nodes. There is one small nuance, though - everything has to be containerized already. DO, by the way, has a very decent managed cluster.
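
To illustrate points 1 and 3 above, a rough sketch of tagging builds by commit id and rolling back to an earlier one. The registry address, image and container names are placeholders, not the asker's real setup:

  # Build and push an image whose tag is the commit it was built from
  TAG=$(git rev-parse --short HEAD)
  docker build -t registry.example.com/myapp:"$TAG" .
  docker push registry.example.com/myapp:"$TAG"

  # Deploying is then "run this tag"; rolling back is running an older tag
  docker pull registry.example.com/myapp:"$TAG"
  docker rm -f myapp 2>/dev/null || true      # hypothetical container name
  docker run -d --name myapp -p 3000:3000 \
    registry.example.com/myapp:"$TAG"

With a HEALTHCHECK baked into the image (see the Dockerfile sketch in the question), docker ps / docker inspect will report the container as healthy or unhealthy, which is exactly the binary 200/500 signal from point 1.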

Viktor Taran, 2019-07-25
@shambler81

monit-it.ru
For a start, that's one option. Among its features is support for Nagios plugins and, most importantly, it connects over ssh, runs a command and draws conclusions from the return code. Especially useful, on an alert it can execute a console command. Best of all, it's a cloud service, so you don't have to monitor the monitoring system itself.
It obviously won't do for monitoring a Yandex-scale infrastructure, but for your own projects it's fine. I monitor about 20 servers and 600 sites with it, and in principle that's enough.
And most importantly, it has the minimal entry level you asked for.

Vasily Shakhunov, 2019-08-01
@inf

1. Put everything in Docker. Google around for minimal Alpine-based containers for your framework.
2. Run tests, without deployment, in something like gitlab-ci (a minimal sketch is below).
3. For orchestration, the simplest option is Swarm, which ships out of the box with Docker itself.
2-3 weeks will not be enough))
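
For point 2, a minimal .gitlab-ci.yml sketch that runs tests on every push and builds/pushes an image only on master. It assumes GitLab's built-in container registry; the node version and npm scripts are placeholders:

  stages:
    - test
    - build

  test:
    stage: test
    image: node:12-alpine
    script:
      - npm ci
      - npm test

  build:
    stage: build
    image: docker:stable
    services:
      - docker:dind
    script:
      - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
      - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA" .
      - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
    only:
      - master

The CI_REGISTRY_* and CI_COMMIT_SHORT_SHA variables are predefined by GitLab CI, so images end up tagged by commit, which also gives you the "roll back to this build" handle from the question.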
