How to get more than 1,000,000 records using Elasticsearch?

elasticsearch

P747, 2019-11-11 03:08:06

Good afternoon!
Our Elasticsearch catalog stores more than 1 million products. I pull the product ids from Elasticsearch. If I fetch around 10,000 products, everything is fine, but if I fetch a larger number, RAM usage grows accordingly.
Is there a way to pull 1 million products in blocks, for example 1,000 records per block? That is, we take one block of records, process it, free the memory, and move on to the next block, and so on.

2 answer(s)

Flying, 2019-11-11
@Flying

You can fetch an arbitrary number of results from Elasticsearch using the scroll API.
This is roughly analogous to cursors in databases. By choosing the window size (documents per request) and the lifetime of the scroll token, you can tune the retrieval to suit your task.
In my project I had to iterate over more than a million documents with scroll, and it works fine.
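
For illustration, the block-by-block loop from the question could look roughly like the sketch below. It is not the code from my project, just a minimal example under a few assumptions: `client` is an object exposing `search`, `scroll` and `clear_scroll` the way the official Python client (elasticsearch-py, 7.x-style `body=` argument) does, and the function name `iter_id_batches`, the index name and the default values are all made up for the example.

```python
from typing import Any, Iterator, List


def iter_id_batches(client: Any, index: str, batch_size: int = 1000,
                    keep_alive: str = "2m") -> Iterator[List[str]]:
    """Yield lists of document ids, one scroll page at a time.

    `client` is assumed to behave like elasticsearch.Elasticsearch;
    only one page of hits is held in memory at any moment.
    """
    # The first request opens the scroll context and returns the first page.
    page = client.search(
        index=index,
        scroll=keep_alive,   # lifetime of the scroll token
        size=batch_size,     # window size: documents per request
        body={
            "query": {"match_all": {}},
            "_source": False,   # only ids are needed, skip document bodies
            "sort": ["_doc"],   # index order, the cheapest order for scrolling
        },
    )
    scroll_id = page.get("_scroll_id")
    try:
        while True:
            hits = page["hits"]["hits"]
            if not hits:
                break  # an empty page means the scroll is exhausted
            yield [hit["_id"] for hit in hits]
            # Ask for the next page; this also renews the token lifetime.
            page = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = page.get("_scroll_id", scroll_id)
    finally:
        # Release the server-side scroll context as soon as we are done.
        if scroll_id is not None:
            client.clear_scroll(scroll_id=scroll_id)
```

With a real cluster it would be used along the lines of `for ids in iter_id_batches(Elasticsearch("http://localhost:9200"), "products"): process(ids)`, where the URL, the index name and `process` are placeholders. The Python client also ships `elasticsearch.helpers.scan`, which wraps the same scroll loop if you prefer not to write it by hand.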

Ivan Shumov, 2019-11-11
@inoise

To begin with, remember that Elasticsearch is about the most resource-hungry creature in this world. Then remember that Elasticsearch is a search index, not a database, and pulling more than a dozen or so records from it is considered a crime. Then note that Elasticsearch limits how many results a single request can return from an index, and that this limit is configurable (read up on the index settings; the index.max_result_window setting defaults to 10,000). Stop doing this, because there are hardware limits anyway.