How do I restore correct operation of an Elasticsearch database?

Alexander Yelagin, 2015-05-07 09:04:29

elasticsearch

Hello!
I have a database on Elasticsearch 1.5.0. The total number of documents across all indexes was 1328134024. After a while, however, a count request (localhost:9200/_count) returns {"count":643792946,"_shards":{"total":35,"successful":35,"failed":0}}. As far as I can tell, about half of the data has been lost somewhere. How can I restore it? The physical size of the database has not changed: it was 200 GB and still is.
If I run the request localhost:9200/_cluster/health?pretty=true, I see the following:

{
  "cluster_name" : "elasticsearch",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 35,
  "active_shards" : 35,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 35,
  "number_of_pending_tasks" : -1
}

That is, for some reason there are unassigned shards.
Can you suggest how to restore normal operation of the database?
When this problem occurred before, I solved it by deleting the entire database and restoring it from a backup, but that only helps for a while. All Elasticsearch settings are at their defaults, except:

curl -XPUT localhost:9200/_cluster/settings -d '{"transient":{"cluster.routing.allocation.enable": "all"}}'
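
For reference, the health output above can be read mechanically. This is a minimal sketch, not a diagnosis: the helper name summarize_health is made up for illustration, and the numbers are copied from the output in the question. It relies only on the fact that "yellow" means every primary shard is assigned while at least one replica is not.

```python
import json

# Fields copied from the _cluster/health output quoted in the question.
HEALTH = json.loads("""
{
  "status": "yellow",
  "active_primary_shards": 35,
  "active_shards": 35,
  "unassigned_shards": 35
}
""")


def summarize_health(health):
    """Return (primaries, assigned_replicas, unassigned) from a health dict.

    summarize_health is a hypothetical helper, not part of any client library.
    """
    primaries = health["active_primary_shards"]
    # active_shards counts primaries plus replicas that are currently assigned.
    assigned_replicas = health["active_shards"] - primaries
    return primaries, assigned_replicas, health["unassigned_shards"]


primaries, assigned_replicas, unassigned = summarize_health(HEALTH)
print("primaries:", primaries)
print("assigned replicas:", assigned_replicas)
print("unassigned shards:", unassigned)
```

Here all 35 primaries are active, no replica is assigned, and the 35 unassigned shards match one replica per primary, which is the default replica count.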

2 answer(s)

Alexey Yamschikov, 2015-05-07
@mobilesfinks

It looks like split-brain. Are both of your nodes acting as master? Install the kopf and HQ plugins. We use both at work, but I personally prefer kopf. It shows what is happening with the shards and what state they are in. We never had a split-brain with two nodes. Now we have 4 nodes (3 masters), and there have been no problems either. This kind of problem (split-brain) can be caused by a bad connection.
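
Without a plugin, the "are both nodes acting as master?" question can also be checked by asking each node which master it currently sees. The following is a rough sketch under assumptions: the node addresses in the usage note are placeholders, the helper names are made up, and it assumes the cluster state endpoint with the master_node metric and local=true answers from each node's own view.

```python
import json
from urllib.request import urlopen


def fetch_master_id(node_url, timeout=5):
    """Ask one node which master it sees (its local view, not the master's)."""
    url = node_url + "/_cluster/state/master_node?local=true"
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["master_node"]


def collect_master_views(node_urls, fetch=fetch_master_id):
    """Map each node URL to the master id that node reports."""
    return {url: fetch(url) for url in node_urls}


def has_split_brain(master_by_node):
    """True if the nodes disagree about who the master is."""
    return len(set(master_by_node.values())) > 1
```

Usage would look like has_split_brain(collect_master_views(["http://node1:9200", "http://node2:9200"])), where both addresses are hypothetical. If it returns True, each side of the cluster has elected its own master.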

mkuzmin, 2015-05-08
@mkuzmin

Maybe this will help:
https://aphyr.com/posts/317-call-me-maybe-elasticsearch
habrahabr.ru/company/percolator/blog/222765
In terms of consistency, availability, and resilience to network failures, Elasticsearch is a CP (consistency and partition tolerance) system, for a rather loose definition of "consistency". If read-only operations predominate, Elasticsearch lets you achieve AP (availability and partition tolerance) behavior by lowering the minimum master nodes parameter, that is, by giving up the quorum. Usually, however, a majority of the nodes in the cluster must be available. Without that majority, writing to a misconfigured cluster, that is, a cluster with a "split brain", can lead to irrecoverable data loss. This is by no means specific to Elasticsearch and is common to other servers as well.
Look into minimum master nodes; maybe it should be equal to the number of servers.
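
As a small sketch of the arithmetic behind that setting: the usual quorum rule is (master-eligible nodes / 2) + 1, rounded down. The helper names below are made up for illustration, and the setting key discovery.zen.minimum_master_nodes is the name used by the 1.x zen discovery; check it against your version before applying anything.

```python
import json


def quorum(master_eligible_nodes):
    """Smallest majority of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def settings_body(master_eligible_nodes):
    """JSON body for a PUT to /_cluster/settings (hypothetical helper)."""
    return json.dumps({
        "persistent": {
            "discovery.zen.minimum_master_nodes": quorum(master_eligible_nodes)
        }
    })


for n in (2, 3, 4):
    print(n, "master-eligible nodes -> quorum", quorum(n))
```

For two master-eligible nodes the quorum is 2, which does equal the number of servers. The trade-off is that if either node goes down, the remaining one cannot elect a master.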