elasticsearch
nicolaa, 2020-11-13 07:54:10

How do I properly index a large amount of data in Elasticsearch?

I created a command in Laravel. In it I fetch all the products:
$prod = Prod::cursor();
then iterate over them and write each one to Elasticsearch:

$this->elasticsearch->index([
  'index' => 'prods',
  'type' => 'prod',
  'id' => $prod->prods_id,
  'body' => [
    'site_id' => $prod->site_id,
    'category_id' => $prod->category_id,
    'fields' => $prod->field,
    'priority' => $prod->priority,
    'prod_updated_at' => $prod->updated_at_p,
    'prod_created_at' => $prod->created_at_p,
    'site_rating' => $prod->site_rating,
    'prod_rating' => $prod->prod_rating,
  ],
]);

There are more than 2 million records in the database.
On average, about 23,000 products are indexed into Elasticsearch per hour, so writing the entire database would take about 4 days.

Is there a faster way to index them?

3 answer(s)
Roman Mirilaczvili, 2020-11-13
@2ord

Although I haven't worked much with ES, it seems strange to me that the index is built on so many fields. What is the practical point of this? Why not store in ES only what needs to be searched and/or analyzed? For everything else, you can use a relational DBMS.
Machine translation of an article about indices and capacity

Flying, 2020-11-13
@Flying

Use the Bulk API, which loads data roughly ten times faster than indexing one document at a time.
That said, putting everything into a single bulk request is not a good idea either. In practice, batches of about a hundred documents are quite sufficient, although this also depends on the size of each document.
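
As a rough illustration of the batching described above, here is a minimal sketch in PHP. It assumes the official elasticsearch-php client, whose bulk() method takes a 'body' array of alternating action and document entries. The helper name buildBulkBatches is hypothetical, and the field names are copied from the question. The '_type' entry mirrors the question's setup and would be dropped on Elasticsearch 7+, where mapping types are deprecated.

```php
<?php

/**
 * Groups products into Bulk API request bodies of at most $batchSize documents.
 * Hypothetical helper: not part of Laravel or elasticsearch-php.
 */
function buildBulkBatches(iterable $prods, int $batchSize = 100): Generator
{
    $batchSize = max(1, $batchSize); // guard against a zero or negative size
    $params = ['body' => []];
    $count = 0;

    foreach ($prods as $prod) {
        // Action entry: says which index and document id the next entry belongs to.
        $params['body'][] = [
            'index' => [
                '_index' => 'prods',
                '_type' => 'prod', // as in the question; omit on Elasticsearch 7+
                '_id' => $prod->prods_id,
            ],
        ];
        // Document entry: the same fields the question indexes one by one.
        $params['body'][] = [
            'site_id' => $prod->site_id,
            'category_id' => $prod->category_id,
            'fields' => $prod->field,
            'priority' => $prod->priority,
            'prod_updated_at' => $prod->updated_at_p,
            'prod_created_at' => $prod->created_at_p,
            'site_rating' => $prod->site_rating,
            'prod_rating' => $prod->prod_rating,
        ];

        // Hand over a full batch and start collecting the next one.
        if (++$count % $batchSize === 0) {
            yield $params;
            $params = ['body' => []];
        }
    }

    // Hand over the final, partially filled batch, if any.
    if (!empty($params['body'])) {
        yield $params;
    }
}

// Possible usage inside the Laravel command (assumes $this->elasticsearch
// is the elasticsearch-php client, as in the question):
//
// foreach (buildBulkBatches(Prod::cursor(), 100) as $params) {
//     $response = $this->elasticsearch->bulk($params);
//     if (!empty($response['errors'])) {
//         // Some documents failed; inspect $response['items'] for details.
//     }
// }
```

The batch size of 100 follows the suggestion above and is only a starting point. With large documents a smaller batch may work better, and with small ones a larger batch may be worth trying.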

Vitaly Karasik, 2020-11-13
@vitaly_il1

There are more than 2 million records in the database.
On average, about 23,000 products are indexed into Elasticsearch per hour, so writing the entire database would take about 4 days.

I have two questions, one of which is slightly off-topic:
1) What server is this running on, and what does top show?
2) Why is 4 days not acceptable for the initial load of the database? Are you going to keep adding new entries?
