Database
NickBor, 2021-04-09 16:23:57

What technology should I choose for working with the database?

Good afternoon! Please help me with a question about databases. I plan to create a database of roughly 100-300 GB (several billion rows in a table, if we count rows), for example readings from many sensors that arrive every hour over several years and then need to be queried and exported. What is the better approach: organize the database with clustering of this data in PostgreSQL, or apply some BigData methods? If PostgreSQL, what order of query processing time is realistic (say, for exporting a slice of about 100,000 rows)? Thanks for the advice!


5 answers
Sergey Gornostaev, 2021-04-09
@sergey-gornostaev

It is impossible to give a specific answer to a question posed this way. Much depends on the structure of the database, the usage scenarios, the intensity of requests, the ratio of reads to writes, and so on. Most likely this is not a question about the DBMS but about the architecture of your application. For what it's worth, in one of my projects, indexed reads from a table with 23 billion records fit into tens of milliseconds with default settings on a not very powerful server.
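As a rough illustration of what "read by index" means here, below is a minimal sketch in Python with psycopg2. The table and column names (sensor_readings, sensor_id, recorded_at, value), the connection string and the sensor id are assumptions made up for this example, not details from Sergey's project; the point is only that a composite B-tree index on (sensor_id, recorded_at) keeps such lookups fast regardless of the total table size.

```python
# Minimal sketch of an indexed read over a hypothetical sensor table.
# All names and the connection string are illustrative assumptions.
import time
import psycopg2

conn = psycopg2.connect("dbname=sensors user=postgres")

with conn, conn.cursor() as cur:
    # One-time setup: a composite B-tree index is what makes the lookup
    # below cheap even on tables with billions of rows.
    cur.execute("""
        CREATE INDEX IF NOT EXISTS sensor_readings_sensor_time_idx
        ON sensor_readings (sensor_id, recorded_at)
    """)

    start = time.perf_counter()
    cur.execute("""
        SELECT recorded_at, value
        FROM sensor_readings
        WHERE sensor_id = %s
          AND recorded_at >= now() - interval '1 day'
        ORDER BY recorded_at
    """, (42,))
    rows = cur.fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"fetched {len(rows)} rows in {elapsed_ms:.1f} ms")

conn.close()
```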

Vladimir Kuts, 2021-04-09
@fox_12

If you have a large volume of sensor readings with timestamps, then you should take a closer look at specialized time-series databases (TSDBs).

ky0, 2021-04-09
@ky0

Keeping huge tables of homogeneous, time-ordered data in a single piece is bad practice. Partitioning the tables by day or month and collapsing old data into aggregates should help; this is exactly what happens in the Zabbix database, for example.
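A minimal sketch of this approach, assuming PostgreSQL 10+ declarative partitioning and a hypothetical sensor_readings table (all names and the connection string are made up for illustration):

```python
# Rough sketch: range partitioning by month plus an aggregate rollup
# table for old data. Table and column names are hypothetical;
# declarative partitioning requires PostgreSQL 10 or newer.
import psycopg2

conn = psycopg2.connect("dbname=sensors user=postgres")

with conn, conn.cursor() as cur:
    # Parent table, partitioned by the timestamp column.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS sensor_readings (
            sensor_id   integer     NOT NULL,
            recorded_at timestamptz NOT NULL,
            value       double precision
        ) PARTITION BY RANGE (recorded_at)
    """)

    # One partition per month; in practice these are created ahead of
    # time by a scheduled job or a tool such as pg_partman.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS sensor_readings_2021_04
        PARTITION OF sensor_readings
        FOR VALUES FROM ('2021-04-01') TO ('2021-05-01')
    """)

    # "Collapsing old data into aggregates": keep hourly averages in a
    # small rollup table, similar to Zabbix trends, and eventually drop
    # the raw partitions.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS sensor_readings_hourly AS
        SELECT sensor_id,
               date_trunc('hour', recorded_at) AS hour,
               avg(value) AS avg_value,
               count(*)   AS samples
        FROM sensor_readings
        GROUP BY sensor_id, date_trunc('hour', recorded_at)
    """)

conn.close()
```

Dropping an entire old partition is far cheaper than deleting billions of individual rows, which is the main payoff of this layout.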

Saboteur, 2021-04-09
@saboteur_kiev

It is unlikely that anyone will say for sure.
Query speed depends on everything: the size of the particular table, what kind of data it holds (both the types and how uniform the values are, which affects how easy it is to build indexes), and the speed of the SSD.
But I would say that 100-300 GB is far from big data. It is just a large database that both MySQL and PostgreSQL can handle, especially since on a server with 128 GB of RAM almost a third of it can be cached in memory.
In your case, without performance tests, no one will give even an approximate order of magnitude.
P.S. And yes, if your data is mostly numbers (a timestamp and numeric values), then time-series databases can handle it better. On the other hand, they are not as popular and perhaps not as mature.
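To make "performance tests" concrete, here is a small sketch that times the exact operation from the question (pulling about 100,000 rows), once with a plain SELECT and once with COPY. The sensor_readings table, its columns and the date range are hypothetical; the numbers this prints on your own hardware are the only real answer to the "what order of time" question.

```python
# Sketch of a simple timing test against a hypothetical sensor table.
import io
import time
import psycopg2

conn = psycopg2.connect("dbname=sensors user=postgres")

with conn, conn.cursor() as cur:
    start = time.perf_counter()
    cur.execute("""
        SELECT sensor_id, recorded_at, value
        FROM sensor_readings
        WHERE recorded_at >= %s AND recorded_at < %s
        LIMIT 100000
    """, ("2021-03-01", "2021-04-01"))
    rows = cur.fetchall()
    print(f"SELECT: {len(rows)} rows in "
          f"{(time.perf_counter() - start) * 1000:.0f} ms")

    # COPY streams the same data as CSV and is usually the fastest way
    # to bulk-export rows from PostgreSQL.
    buf = io.StringIO()
    start = time.perf_counter()
    cur.copy_expert("""
        COPY (
            SELECT sensor_id, recorded_at, value
            FROM sensor_readings
            WHERE recorded_at >= '2021-03-01' AND recorded_at < '2021-04-01'
            LIMIT 100000
        ) TO STDOUT WITH CSV
    """, buf)
    print(f"COPY: {buf.tell()} bytes in "
          f"{(time.perf_counter() - start) * 1000:.0f} ms")

conn.close()
```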

Roman Mirilaczvili, 2021-04-09
@2ord

Try one of the TSDBs, as Vladimir Kuts pointed out.
Specifically, the VictoriaMetrics repository. PromQL is used as the query language.

It can be used as a long-term data store connected to Prometheus and Grafana.

As for its advantages over the others, see the link above.
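For a feel of what working with VictoriaMetrics looks like, here is a small sketch that queries a single-node instance through the Prometheus-compatible querying API it implements (default port 8428). The metric name sensor_value and the label sensor_id are invented for the example; the /api/v1/query and /api/v1/query_range endpoints and their parameters come from the standard Prometheus HTTP API.

```python
# Sketch of querying a single-node VictoriaMetrics instance via its
# Prometheus-compatible HTTP API. Metric and label names are made up.
import time
import requests

BASE = "http://localhost:8428"
now = int(time.time())

# Instant query: the current value of one sensor's series.
instant = requests.get(
    f"{BASE}/api/v1/query",
    params={"query": 'sensor_value{sensor_id="42"}'},
)
print(instant.json())

# Range query: hourly averages over the last day, roughly the
# "export a slice of data" use case from the question.
ranged = requests.get(
    f"{BASE}/api/v1/query_range",
    params={
        "query": 'avg_over_time(sensor_value{sensor_id="42"}[1h])',
        "start": now - 24 * 3600,
        "end": now,
        "step": "1h",
    },
)
print(ranged.json())
```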
