PostgreSQL
Pavel Talaiko, 2019-12-08 10:54:26

How to optimize bulk inserts with Spring, Hibernate, and PostgreSQL?

Good day, friends.
I need to insert a very large amount of data while preventing duplicates in one field. That the field needs a unique constraint is clear.

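// Keep only the items whose title is not yet in the database.
Set<T> records = new HashSet<>();
for (T item : data) {
    // One SELECT per item.
    final boolean exists = repository.existsByTitle(item.getTitle());
    if (!exists) {
        records.add(item);
    }
}
// Save the filtered set built above.
repository.saveAll(records);
repository.flush();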

How can these queries be optimized? What techniques are there for bulk inserts? I would be very grateful for any advice. There are millions of records, and tens of thousands more are added every second. How should this be handled?


2 answer(s)
Melkij, 2019-12-08
@melkij

Use INSERT ... ON CONFLICT DO NOTHING.
Another possibility is a view with an INSTEAD OF trigger: write into the view with COPY, and let the trigger perform the INSERT ... ON CONFLICT.
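
A minimal sketch of the first suggestion using plain JDBC batching, since the answer itself gives no code. The table name `web_resource`, the class `ConflictFreeInsert`, and the method `insertAll` are assumptions made for illustration; only the `title` field comes from the question, and a unique constraint on it is assumed to exist.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collection;

class ConflictFreeInsert {

    // Hypothetical table name; a UNIQUE constraint or index on title is assumed.
    static final String SQL =
            "INSERT INTO web_resource (title) VALUES (?) ON CONFLICT (title) DO NOTHING";

    // Sends the titles in JDBC batches of batchSize statements.
    // Rows whose title already exists are skipped by the database itself,
    // so no existence check is issued from the application.
    static void insertAll(Connection connection, Collection<String> titles, int batchSize)
            throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(SQL)) {
            int pending = 0;
            for (String title : titles) {
                statement.setString(1, title);
                statement.addBatch();
                if (++pending == batchSize) {
                    statement.executeBatch();
                    pending = 0;
                }
            }
            // Flush the last, possibly partial, batch.
            if (pending > 0) {
                statement.executeBatch();
            }
        }
    }
}
```

This bypasses the entity layer entirely, so Hibernate's persistence context would not be aware of the inserted rows.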

Sergey Gornostaev, 2019-12-08
@sergey-gornostaev

The option proposed by Melkij is much more efficient, but Hibernate cannot do this on its own. If you want a solution at the ORM level: select the set of existing titles from the database, take the difference with the set of records being inserted, and save the result, after tuning the hibernate.jdbc.batch_size parameter to an effective value. Naturally, title must be indexed.
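
The set-difference step could look roughly like the sketch below. The class `NewTitleFilter` and the method `newTitles` are hypothetical names; how the existing titles are loaded (for example, one query per chunk of incoming titles) is left out and would depend on the repository.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class NewTitleFilter {

    // Returns the incoming titles that are absent from existingTitles,
    // without duplicates and in their original order.
    static List<String> newTitles(Collection<String> incoming, Set<String> existingTitles) {
        // Copy so the caller's set is not modified.
        Set<String> seen = new HashSet<>(existingTitles);
        List<String> result = new ArrayList<>();
        for (String title : incoming) {
            // add returns false when the title is already known,
            // either from the database or from earlier in this batch.
            if (seen.add(title)) {
                result.add(title);
            }
        }
        return result;
    }
}
```

This replaces one `existsByTitle` call per item with a single in-memory pass. Another writer can still insert the same title between the select and the save, so the unique constraint remains the final safeguard. In a Spring Boot application the batch size would typically be set through `spring.jpa.properties.hibernate.jdbc.batch_size`; the right value needs to be measured.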
