elasticsearch
bio, 2017-02-17 08:25:49

How to update/reindex properly?

Hello!
How do I properly update/reindex documents when index settings change (analysis settings are added or changed) or when mappings change?
Also, documents may be added or changed while the reindex is running.


2 answer(s)
Zakharov Alexander, 2017-02-17
@AlexZaharow

As far as I know, if you change the analysis settings or an existing mapping, only a complete reindex will do. (The exception is when you only add something to the mapping without changing or removing what is already there.) Unlike SQL/NoSQL databases, ES does not automatically reindex existing data when the settings change, even if you store the data in it.
P.S.
Sorry, but it looks like you have only just started learning ES. This is simply how it works: it is not a database, although it can store data. That is a significant difference from the usual setup, where the data and the indexes are kept separately.

Vadim Stepanov, 2017-02-23
@Vdm17

The first thing that comes to mind is to use the option described here: https://www.elastic.co/guide/en/elasticsearch/guid...
It gives a good description of how to reindex without stopping the application.
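A minimal sketch of one common way to reindex without downtime, assuming the application talks to an alias rather than to a concrete index name. The link above is truncated, so this may or may not be exactly what it describes. The names `products`, `products_v1` and `products_v2` are hypothetical. The functions only build the JSON bodies for the `_reindex` and `_aliases` endpoints; nothing is sent to a cluster.

```python
import json


def reindex_body(old_index, new_index):
    """Body for POST /_reindex: copy every document from old_index to new_index."""
    return {"source": {"index": old_index}, "dest": {"index": new_index}}


def alias_swap_body(alias, old_index, new_index):
    """Body for POST /_aliases: repoint the alias in one atomic request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical index and alias names, for illustration only.
    print(json.dumps(reindex_body("products_v1", "products_v2"), indent=2))
    print(json.dumps(alias_swap_body("products", "products_v1", "products_v2"), indent=2))
```

The new index has to be created with the new settings and mappings before the `_reindex` call; the alias is switched only after the copy has finished.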
As for documents that may change during the reindex, there are several options.
The first is the one proposed by Alexander Zakharov: an additional field holding the date the document was last updated. In this case, after the reindex you need to make an additional pass over the old index and find the documents whose dates differ.
The second option is to use document versions instead of dates. But! For this, the documents must be transferred during the reindex together with their version numbers (otherwise they will all be added with _version=1).
The third option, a combination of the first two, is to keep your own dedicated field for document versions and tell ES about it; ES supports "external versions".
With the second and third options you still have to go through the old index after the reindex and determine which documents have changed.
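The three options above could look roughly like this as request bodies. This is a sketch under assumptions: `updated_at` and `my_version` are hypothetical fields maintained by the application, the index names are made up, and nothing is sent to a cluster. Setting `version_type` to `external` on the `_reindex` destination makes ES keep the version from the source and overwrite a destination document only when the source version is newer.

```python
def changed_since_query(date_field, reindex_started_at):
    """Search body for the old index: documents updated at or after the reindex start."""
    return {"query": {"range": {date_field: {"gte": reindex_started_at}}}}


def versioned_reindex_body(old_index, new_index):
    """Body for POST /_reindex that carries document versions over."""
    return {
        # Count version conflicts instead of aborting on the first one.
        "conflicts": "proceed",
        "source": {"index": old_index},
        "dest": {"index": new_index, "version_type": "external"},
    }


def external_version_params(doc, version_field):
    """Query-string parameters for indexing one document with an application-managed version."""
    return {"version": doc[version_field], "version_type": "external"}


if __name__ == "__main__":
    # Hypothetical field names, index names and timestamp.
    print(changed_since_query("updated_at", "2017-02-17T08:25:49"))
    print(versioned_reindex_body("products_v1", "products_v2"))
    print(external_version_params({"my_version": 7, "title": "x"}, "my_version"))
```

With versions carried over, one way to do the follow-up pass is to run the same versioned reindex again: documents that are already current in the new index are reported as version conflicts and left alone. Deletions made in the old index are not covered by this.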
The worst case you can run into: you are running a reindex, and meanwhile document X is changed in the old index. The reindex finishes and the search for changes begins. At this point document X is changed again, but this time in the new index. While searching for changes, you see that document X was changed in the old index and needs to be moved to the new one, yet it has also been changed there. What to do in this case is up to you. It all depends on how important the data is and whether the changes can be merged.
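To make the worst case concrete, here is a small in-memory sketch of how the follow-up pass might classify each document. It assumes versions were carried over during the reindex and that the application recorded the version each document had when it was copied (`baseline_version`, a hypothetical bookkeeping value, not something ES provides). It does not resolve the conflict; it only detects it.

```python
def classify(baseline_version, old_version, new_version):
    """Decide what the follow-up pass should do with one document.

    baseline_version: version the document had when the reindex copied it
    old_version:      its current version in the old index
    new_version:      its current version in the new index
    """
    changed_in_old = old_version > baseline_version
    changed_in_new = new_version > baseline_version
    if changed_in_old and changed_in_new:
        return "conflict"  # edited on both sides: merge or pick a winner
    if changed_in_old:
        return "copy"      # only the old index changed: move it over
    return "skip"          # the new index already has the latest state


if __name__ == "__main__":
    # Document X: copied at version 1, then edited once in each index.
    print(classify(baseline_version=1, old_version=2, new_version=2))
```

Only the "conflict" branch needs a decision from you; the other two are mechanical.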
