open source
RedQuark, 2013-06-23 19:34:35

Setting up Elasticsearch for Russian

Problem: as soon as the data contains Russian letters, adding a record fails. Most likely, instead of the default index I need to create something like this (link):

curl -XPUT "http://localhost:9200/project/_settings?pretty=true" -d '{"index":{"index.analysis.analyzer.english.language":"English","index.analysis.analyzer.russian.filter.0":"lowercase","index.analysis.analyzer.russian.filter.1":"russian_morphology","index.number_of_shards":"1","index.analysis.analyzer.russian.filter.2":"stop","index.analysis.analyzer.russian.language":"Russian","index.analysis.analyzer.russian.tokenizer":"standard","index.analysis.analyzer.english.type":"snowball","index.number_of_replicas":"1"}}'


but so far none of the variants I have tried gives the desired result. I would be grateful if someone could share an example script that creates an index that works with the Russian language.


1 answer(s)
vitalybaev, 2014-02-28

Which version of Elasticsearch and which OS are you using?
In our project we use the following plugin to support Russian morphology:
https://github.com/imotov/elasticsearch-analysis-m...
That said, I don't remember adding records with Russian (or any other UTF-8) characters ever failing without it.
We have used ES from version 0.2.x up to 1.0.1 on Debian 6 Squeeze.
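
For reference, here is a rough sketch of what an index-creation script might look like. It is not a tested recipe. The host, the index name project and the analyzer name russian are copied from the question. The script assumes the russian_morphology filter is supplied by an installed morphology plugin such as the one linked above; without it, index creation will probably be rejected. ES_URL, INDEX, SETTINGS, create_index and analyze_sample are hypothetical names invented for this example. The request shapes target the ES versions discussed here (up to 1.0.1), and later versions may need adjustments.

```shell
#!/bin/sh
# Sketch only: host, index name and analyzer name come from the question.
# ASSUMPTION: the "russian_morphology" token filter is provided by an
# installed morphology plugin; it is not part of stock Elasticsearch.
ES_URL="${ES_URL:-http://localhost:9200}"
INDEX="${INDEX:-project}"

# Settings are passed when the index is created, with the analysis
# section nested under "settings" instead of flattened dotted keys.
SETTINGS='{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1,
      "analysis": {
        "analyzer": {
          "russian": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "russian_morphology", "stop"]
          }
        }
      }
    }
  }
}'

# Create the index with the settings above.
create_index() {
  curl -XPUT "$ES_URL/$INDEX?pretty=true" -d "$SETTINGS"
}

# Run a Russian phrase through the analyzer to inspect the tokens it emits.
analyze_sample() {
  curl -XGET "$ES_URL/$INDEX/_analyze?analyzer=russian&pretty=true" -d 'Привет, мир'
}
```

Sourcing the file and calling create_index followed by analyze_sample should show whether the analyzer is picked up. The number of shards is set at creation time here because it cannot be changed later through _settings on an existing index.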
