MySQL
Sratimon1, 2018-09-12 04:56:16

Dealing with a large amount of Node + MySQL data?

The task is simple: we receive about 50,000 numbers from the client in a single batch. We check whether they are already in the database and insert the ones that are not. The database will initially hold 10,000,000 records of about 200 bytes each. The client can also request those 50,000 numbers back. Each number is unique.
I am using Node + Express with a MySQL database and the Sequelize ORM.
The question is about the speed of these operations and the correct way to implement them.
1) I receive an array of objects from the user, query MySQL for that array, find the numbers that are missing, and write them to the database.
Is it possible to compose SQL that inserts a number only if it does not exist and then returns the ids of the numbers? Or do you have other suggestions for making this faster?
2) Keeping the numbers in memory would be faster, but how should that be implemented?
Getting a result from the database and storing it in a variable is easy, but every time the database is updated the whole table would have to be re-selected, and the data will change often.
3) Use something other than MySQL. Then what options can you suggest?
Ideally, I need to keep one table permanently in memory, because it will carry the main selects and writes.
4) Your own suggestions. The customer imposes no constraints. At the moment there is a VPS with an i7-6700 and 32 GB of RAM.
P.S. I have been studying web development for about 9 months (I had a background in C++, C++ Qt, and C# programming), but I have never worked with a database at these volumes. Searching mostly turns up light articles for beginners, or serious people explaining serious things that I cannot follow. So I am asking anyone who can to help.

3 answers
Alexander Kuznetsov, 2018-09-12
@Sratimon1

Have you looked at 13.2.5.2 INSERT ... ON DUPLICATE KEY UPDATE Syntax in the MySQL manual?
It lets you insert records and, on an insert conflict, turn the insert into an UPDATE. For 50k records, finding the ones that do not exist by first loading them from the database is, as far as I know, not the fastest option.
Another option is a buffer (a separate table for incoming data, where each incoming batch is marked with a unique identifier) or a temporary table, with the rows then processed at the database level by stored procedures or plain queries; but it is unlikely to take fewer operations than the first approach.
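A minimal sketch of that staging-table variant, assuming a main table numbers with a unique phone column (all names here are illustrative, not from the question):

CREATE TEMPORARY TABLE numbers_in (phone BIGINT UNSIGNED NOT NULL);

-- load the incoming 50k numbers with multi-row inserts (or LOAD DATA LOCAL INFILE)
INSERT INTO numbers_in (phone) VALUES (5),(8),(123);

-- one set-based statement writes only the missing numbers
INSERT INTO numbers (phone)
SELECT i.phone
FROM numbers_in i
LEFT JOIN numbers n ON n.phone = i.phone
WHERE n.phone IS NULL;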
UPD. I forgot to write: you can also insert in batches, like this:

INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
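
If nothing needs to change on a conflict, INSERT IGNORE is a close variant, and the ids the question asks about can then be read back with a plain SELECT (table and column names are assumptions):

-- duplicate numbers are skipped instead of raising an error
INSERT IGNORE INTO numbers (phone) VALUES (5),(8),(123);
-- fetch the ids for the whole batch, existing and newly inserted alike
SELECT id, phone FROM numbers WHERE phone IN (5,8,123);

Keep in mind that INSERT IGNORE turns other errors (truncation, bad values) into warnings as well, so ON DUPLICATE KEY UPDATE is the stricter choice.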

Vitaly Inchin ☢, 2018-09-12
@In4in

Maybe you should just take a closer look at Elastic.

Andrey Shatokhin, 2018-09-12
@Sovigod

You have already been told about INSERT ... ON DUPLICATE KEY UPDATE, and that advice is right.
If the phone number in your table is a unique, immutable field, I would consider making it the primary key of that table.
Then do the lookup along the lines of select * from table where phone in (5,8,123).
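A sketch of that schema, assuming the roughly 200 bytes per record from the question sit in a single payload column (names are illustrative):

-- phone itself is the primary key, so no separate id column is needed
CREATE TABLE numbers (
  phone BIGINT UNSIGNED NOT NULL PRIMARY KEY,
  payload VARBINARY(200)
) ENGINE=InnoDB;

SELECT phone FROM numbers WHERE phone IN (5,8,123);

With InnoDB the rows are clustered on the primary key, so this IN (...) lookup and the duplicate check on insert both use the same index.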
