gremlintv2, 2019-02-19 12:25:51

How to replicate ELASTICSEARCH more "elegantly" via an ANSIBLE playbook?

Hello,
After setting up a MASTER -> SLAVE replica on POSTGRES, it turned out that a similar kind of replication is needed for ELASTICSEARCH.
After googling a bit, I found out that ELASTICSEARCH only supports multi-master cluster replication (MASTER - MASTER ELIGIBLE).

Why the snapshot-based method does not suit me:

To get something like MASTER -> SLAVE as on POSTGRES, the snapshot mechanism can be used, with periodic snapshot creation/restoration via cron. But as I understand it, such a scheme requires:
  1. create a snapshot,
  2. stop ELASTICSEARCH on the "SLAVE" node (at which moment it is unavailable to the client)
  3. restore the snapshot
  4. start ELASTICSEARCH
This scheme does not fit, since the application needs ELASTICSEARCH to be constantly available.
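For reference, a minimal sketch of what that cron-driven cycle could look like as ANSIBLE tasks. The host names and the repository name "backup" are my own illustrative assumptions, and both nodes would have to see the same registered snapshot repository. Note that the restore API only requires the target indices to be closed, not the whole node stopped, although closed indices are still unsearchable:

# Hypothetical sketch: periodic snapshot on the master, restore on the
# "slave". Assumes a snapshot repository named "backup" is already
# registered on both sides (shared FS, S3, etc.).
- name: Create a snapshot of all indices on the master
  uri:
    url: "http://es-master:9200/_snapshot/backup/snap-{{ ansible_date_time.date }}?wait_for_completion=true"
    method: PUT

- name: Close all indices on the slave (restore cannot write into open indices)
  uri:
    url: "http://es-slave:9200/_all/_close"
    method: POST

- name: Restore the snapshot on the slave
  uri:
    url: "http://es-slave:9200/_snapshot/backup/snap-{{ ansible_date_time.date }}/_restore?wait_for_completion=true"
    method: POST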
So the most suitable option in this case is the approach proposed by the ELASTIC team: MASTER - MASTER ELIGIBLE.

When adding a new node to the MASTER - MASTER ELIGIBLE cluster, the following has to be run via ANSIBLE on all existing nodes, including the master (a rough sketch follows the list):
  1. add an entry for the new host to hosts and elasticsearch.yml
  2. change the firewall rules
  3. restart the elasticsearch service
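Sketched as ANSIBLE tasks, assuming ufw for the firewall and a service-managed elasticsearch; the variables new_node_ip, new_node_name and all_node_names are illustrative:

# Run on every existing node when a new node is added (illustrative).
- name: Add the new host to /etc/hosts
  lineinfile:
    path: /etc/hosts
    line: "{{ new_node_ip }} {{ new_node_name }}"

- name: Register the new node in the unicast host list
  lineinfile:
    path: /etc/elasticsearch/elasticsearch.yml
    regexp: '^discovery\.zen\.ping\.unicast\.hosts:'
    line: "discovery.zen.ping.unicast.hosts: {{ all_node_names | to_json }}"

- name: Allow transport traffic (port 9300) from the new node
  ufw:
    rule: allow
    src: "{{ new_node_ip }}"
    port: "9300"
    proto: tcp

- name: Restart the elasticsearch service
  service:
    name: elasticsearch
    state: restarted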
A node's IP can also change (a peculiarity of the project), which requires an additional playbook with the same tasks.
Such manipulations raise certain concerns:
1) Will this cause split brain if something goes "wrong" (temporary unavailability of one node's IP)?
2) Could the whole cluster crash after ELASTICSEARCH is restarted (temporary unavailability of one node's IP)?

1) How can these problems (see above) be prevented?
2) Maybe there is a more elegant way? (In my opinion, Postgres is more convenient here, despite the single point of failure.)

I also found a tutorial, but the questions remain the same, plus one more: 3) How reliable is tinc, what can cause it to crash, and can it be configured inside an OPENVZ container?


1 answer
Alexey, 2019-02-19
@gremlintv2

You are wrong; I have never restarted existing nodes in prod after adding new ones.
Everything is done on the fly: in the config of the new node you list all the existing nodes; after you start it, it joins the cluster and starts syncing (better done when the load on the cluster is minimal, since there will be a lot of copying). How long the sync takes depends on the shard and replica allocation settings. Accordingly (a config sketch follows the list):
1. Remove any network restrictions (add or fix firewall rules, etc.), if they exist
2. Launch the new node; it will join the cluster by itself and sync
3. Add the new node to the config of the existing ones.
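As an illustration, the new node's elasticsearch.yml could look roughly like this (pre-7.x settings, matching the question; cluster and node names are made up):

# elasticsearch.yml on the new (4th) node -- illustrative names.
cluster.name: my-cluster
node.name: es-node-4
network.host: 0.0.0.0
# List the existing nodes so the new one can discover the cluster:
discovery.zen.ping.unicast.hosts: ["es-node-1", "es-node-2", "es-node-3"]
discovery.zen.minimum_master_nodes: 3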
Regarding your split-brain questions: elastic has a setting for the minimum number of master-eligible nodes required for the cluster to operate: discovery.zen.minimum_master_nodes: <number>.
For example, with 5 nodes, 1 primary shard per node, and 2 replicas of each shard on other nodes, and discovery.zen.minimum_master_nodes: 3 set, you can always take 2 servers out for maintenance; the cluster will turn yellow but will still serve data (more slowly, with some performance degradation).
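To actually rule out split brain (in the pre-7.x versions this setting belongs to), the value has to follow the quorum formula; for the 5-node example above:

# Quorum formula: minimum_master_nodes = (master_eligible_nodes / 2) + 1
# With 5 master-eligible nodes: 5 / 2 + 1 = 3 (integer division).
discovery.zen.minimum_master_nodes: 3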
