Database design
Aidar Khayatov, 2018-11-27 08:02:04

How to choose architecture and database for a highly loaded system?

Good afternoon! I would like some advice from experienced backend programmers.
The task is to build a high-load project (something like a cash register system). Within 1-2 years the database is expected to hold about 150 million records for the main entity (sale).
From time to time people will need to run reports, so this data has to be read as quickly as possible. The big advantage is that these 150 million records are spread across roughly 1000-5000 different users, and each query only needs the data of a single user. What is the best way to store such data? In one table? Or should it be split across different tables, with a mapping of which user's data is stored in which database?
I am currently looking into which DBMS to choose. I work with MySQL myself: will it handle such volumes on ordinary hardware, or should I look at other databases, keeping scalability (a growing number of records) in mind?
Thank you!

1 answer(s)
stratosmi, 2018-11-27
@Haiatov

Good afternoon! I would like some advice from experienced backend programmers.
The task is to build a high-load project (something like a cash register system). Within 1-2 years the database is expected to hold about 150 million records for the main entity (sale).

150 million records is trivial; that is not a high-load system.
I create 5,000 records per second on a rather weak VDS/VPS server (hosting costs about 500 rubles a month).
Two years? At that rate, 150 million records take about 8 to 9 hours.
And no, I do not consider even that setup high-load.
Loaded, yes.
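The comparison above is simple arithmetic, and it can be checked in a few lines. This is only a rough sketch: it assumes the 5,000 records per second rate is sustained, and that the 150 million records from the question arrive evenly over two years.

```python
# Back-of-the-envelope check of the figures quoted above.
TARGET_ROWS = 150_000_000    # records expected after 1-2 years (from the question)
INSERTS_PER_SECOND = 5_000   # insert rate reported in the answer

# Time needed to write all the records at the reported rate.
seconds_needed = TARGET_ROWS / INSERTS_PER_SECOND
hours_needed = seconds_needed / 3600

# Average insert rate if the same records arrive evenly over two years.
SECONDS_IN_TWO_YEARS = 2 * 365 * 24 * 3600
average_rate = TARGET_ROWS / SECONDS_IN_TWO_YEARS

print(f"{hours_needed:.1f} hours at {INSERTS_PER_SECOND} records/s")
print(f"{average_rate:.2f} records/s averaged over two years")
```

Under these assumptions the project would average only a couple of inserts per second, which is why the write side is not the hard part here.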
No.
People don't need reports over all the data at once. Only part of the data is of interest to them.
If you do need all the data at once (say, some overall statistics), aggregate the primary data in advance (for example, at night); reports built on the aggregates are then practically instant.
Only if your 1000-5000 users are constantly pulling data can this be called a loaded system.
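The nightly aggregation mentioned above could look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The table names (sale, daily_sales), the columns, and the aggregate_day function are hypothetical and not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real database
conn.executescript("""
    CREATE TABLE sale (               -- primary data (hypothetical schema)
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        sold_on TEXT NOT NULL,        -- ISO date, e.g. '2018-11-26'
        amount_cents INTEGER NOT NULL
    );
    CREATE TABLE daily_sales (        -- pre-aggregated data for reports
        user_id INTEGER NOT NULL,
        sold_on TEXT NOT NULL,
        sale_count INTEGER NOT NULL,
        total_cents INTEGER NOT NULL,
        PRIMARY KEY (user_id, sold_on)
    );
""")

conn.executemany(
    "INSERT INTO sale (user_id, sold_on, amount_cents) VALUES (?, ?, ?)",
    [(1, "2018-11-26", 1000), (1, "2018-11-26", 550), (2, "2018-11-26", 700)],
)


def aggregate_day(conn, day):
    """Rebuild the summary rows for one day from the primary data."""
    with conn:  # one transaction: drop the old summary, insert the new one
        conn.execute("DELETE FROM daily_sales WHERE sold_on = ?", (day,))
        conn.execute(
            """
            INSERT INTO daily_sales (user_id, sold_on, sale_count, total_cents)
            SELECT user_id, sold_on, COUNT(*), SUM(amount_cents)
            FROM sale
            WHERE sold_on = ?
            GROUP BY user_id, sold_on
            """,
            (day,),
        )


aggregate_day(conn, "2018-11-26")  # would be run by a nightly job

# A report now reads one small summary row instead of scanning all sales.
report = conn.execute(
    "SELECT sale_count, total_cents FROM daily_sales"
    " WHERE user_id = ? AND sold_on = ?",
    (1, "2018-11-26"),
).fetchone()
print(report)
```

Deleting and re-inserting the day's rows inside one transaction keeps the job safe to re-run for the same day.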
It depends on the kind of data.
What exactly is the data?
MySQL is pretty fast.
PostgreSQL, for example, has more features. But it is not necessarily faster.
How about looking at the official documentation?
https://dev.mysql.com/doc/refman/8.0/en/limits.html
For a modern DBMS on a modern computer (even one below "normal hardware"), 150 million records is trivial, not a burden.
PS:
There are various solutions for highly loaded reporting systems:
  1. Preliminary (nightly) data aggregation.
  2. Master-slave replication, where the master only handles data updates and the slave is used only for reports.
  3. Specialized DBMSs tailored to a specific type of data (InfluxDB, Redis/Tarantool/Aerospike, ClickHouse, etc.).
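As an illustration of option 2, here is a minimal sketch of routing statements between a master and a slave at the application level. Everything in it is hypothetical: FakeConnection merely records statements in place of a real database connection, and the "starts with SELECT" rule is a deliberate simplification.

```python
class FakeConnection:
    """Stand-in for a real database connection; it only records statements."""

    def __init__(self, name):
        self.name = name
        self.statements = []

    def execute(self, sql, params=()):
        self.statements.append(sql)


class ReadWriteRouter:
    """Send plain SELECT statements to the slave, everything else to the master."""

    def __init__(self, master, slave):
        self.master = master
        self.slave = slave

    def execute(self, sql, params=()):
        # Naive rule for illustration: a statement is a read if it starts with SELECT.
        is_read = sql.lstrip().upper().startswith("SELECT")
        target = self.slave if is_read else self.master
        target.execute(sql, params)
        return target.name  # which server handled the statement


master = FakeConnection("master")
slave = FakeConnection("slave")
db = ReadWriteRouter(master, slave)

db.execute("INSERT INTO sale (user_id, amount_cents) VALUES (?, ?)", (1, 1000))
db.execute("SELECT SUM(amount_cents) FROM sale WHERE user_id = ?", (1,))

print(len(master.statements), len(slave.statements))
```

With asynchronous replication the slave can lag behind the master, so reports served this way may be slightly out of date.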
