un1t, 2016-06-02 11:38:56

Why do Elasticsearch replicas get out of sync?

I have two servers. The index is not large (hundreds of thousands of documents), so I don't need to split it into multiple shards.
But I do need replicas for load balancing.
Not much data is added or updated per day, only hundreds of documents.
Sometimes, though, the same request sent to the first and to the second server returns different data.
Config on the first node:

cluster.name: mycluster
node.name: node-1

node.master: true
node.data: true

network.host: 0.0.0.0
transport.tcp.port: 9300
http.port: 9200

discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["192.168.0.1", "192.168.0.1"]

On the second node the config is the same except for two lines:
node.name: node-2
node.master: false

curl -XGET 'localhost:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "mycluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 10,
  "active_shards" : 20,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
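
A small sketch of how this response can be read, using only numbers copied from the output above. The helper name is made up for illustration. Note that "green" only says that every primary and replica shard is allocated; it says nothing about whether two copies return hits in the same order.

```python
import json

# Fields copied from the health response above (only the ones used here).
health = json.loads("""
{
  "status": "green",
  "number_of_data_nodes": 2,
  "active_primary_shards": 10,
  "active_shards": 20,
  "unassigned_shards": 0
}
""")


def replica_copies_per_primary(h):
    """Average number of active replica copies per primary shard."""
    return h["active_shards"] / h["active_primary_shards"] - 1


# 20 active shards over 10 primaries: one replica copy per primary.
print(replica_copies_per_primary(health))  # 1.0
```

So the cluster already keeps one replica of every primary shard, and with two data nodes each node holds a full copy of the data.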

Write requests are sent to both the master and the replica.
1. Perhaps I should set the number_of_shards and number_of_replicas parameters explicitly and re-create the index? Or should it not matter?
index.number_of_shards: 1
index.number_of_replicas: 1

I also don't quite understand which numbers should go here. 1 shard and 1 replica? And with 3 servers, would that be 1 shard and 2 replicas?
2. How can I tell that some kind of desynchronization has occurred?
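
A rough sketch for both questions, not a definitive recipe. number_of_replicas counts the extra copies of each primary shard, so a full copy on every node needs one replica fewer than there are nodes. number_of_shards is fixed when the index is created, while number_of_replicas can be changed later. For the second question, one option is to compare the document counts that `_cat/shards?h=index,shard,prirep,state,docs` reports for each copy. The function names and the sample text are invented for illustration, and equal counts do not prove identical content (counts can also differ briefly between refreshes).

```python
def replicas_for_full_copy_on_each_node(node_count):
    """One primary plus (node_count - 1) replicas gives every node a copy."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return node_count - 1


def index_settings(node_count, shards=1):
    """Body for an index-creation request (a sketch; adjust to your index)."""
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas_for_full_copy_on_each_node(node_count),
        }
    }


def mismatched_shards(cat_shards_text):
    """Return (index, shard) pairs whose started copies report different doc counts.

    Expects the plain-text output of
    _cat/shards?h=index,shard,prirep,state,docs
    """
    counts = {}
    for line in cat_shards_text.splitlines():
        parts = line.split()
        # Skip blank lines and shards that are not started (no doc count yet).
        if len(parts) != 5 or parts[3] != "STARTED":
            continue
        index, shard, _prirep, _state, docs = parts
        counts.setdefault((index, shard), set()).add(int(docs))
    return sorted(key for key, seen in counts.items() if len(seen) > 1)


# Made-up sample output, for illustration only.
sample = """
myindex 0 p STARTED 1200
myindex 0 r STARTED 1200
other   0 p STARTED 50
other   0 r STARTED 49
"""

print(index_settings(3))
print(mismatched_shards(sample))
```

With 2 servers this gives 1 shard and 1 replica, with 3 servers 1 shard and 2 replicas.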


1 answer(s)
un1t, 2016-06-09

I figured out what it was. It wasn't a synchronization problem: the sort order was not defined, so different nodes returned the results in a different order for the same request. Even repeated requests to a single node gave different results (apparently Elasticsearch itself balances requests between the shard copies). I added sorting by id, and the problem was solved.
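
A toy illustration of the effect, not how Elasticsearch is implemented internally. It assumes two copies of a shard hold the same documents in a different internal order and that all hits have the same score. The field name `id` and both helper functions are assumptions for the example.

```python
# Same documents, same scores, different internal order on each copy.
primary = [{"id": 3, "score": 1.0}, {"id": 1, "score": 1.0}, {"id": 2, "score": 1.0}]
replica = [{"id": 2, "score": 1.0}, {"id": 3, "score": 1.0}, {"id": 1, "score": 1.0}]


def top_ids(copy, size, tie_break=False):
    """Return the ids of the first `size` hits from one shard copy."""
    if tie_break:
        key = lambda d: (-d["score"], d["id"])  # score first, then id
    else:
        key = lambda d: -d["score"]  # ties keep the copy's internal order
    return [d["id"] for d in sorted(copy, key=key)[:size]]


def search_body(query, id_field="id"):
    """Search request body that sorts by score, then by a unique field."""
    return {"query": query, "sort": ["_score", {id_field: "asc"}]}


print(top_ids(primary, 2), top_ids(replica, 2))              # [3, 1] [2, 3]
print(top_ids(primary, 2, True), top_ids(replica, 2, True))  # [1, 2] [1, 2]
print(search_body({"match_all": {}}))
```

Without a tie-breaker the two copies disagree on the first page even though their data is identical; with the id as a secondary sort key both return the same hits.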
