NoSQL
andrew_tch, 2016-09-28 17:49:22

In which database to store one large table?

Hey!
The data is all of one type, but there is a lot of it. These are events from a metrics system: each event has a timestamp and a handful of description fields. There is only one event type; essentially they are JSON documents.
I need something that can store these events (essentially a single table), handle a lot of data (hundreds of millions to billions of rows), and run selections in reasonable time (for example, select all records with host = foobar, or with type = emergency).
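For concreteness, a hypothetical event and the kind of selection I need might look like this (field names are made up):

```python
import json

# Hypothetical examples of the events described: a timestamp plus a
# handful of description fields, all with the same shape.
events = [
    {"ts": "2016-09-28T17:49:22Z", "host": "foobar", "type": "emergency", "value": 1},
    {"ts": "2016-09-28T17:50:01Z", "host": "other", "type": "info", "value": 7},
]

# The kind of selection from the question: all records with host = foobar.
matches = [e for e in events if e["host"] == "foobar"]
print(json.dumps(matches))
```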
Where is the best place to store this? MongoDB? Riak? CouchDB? Plain old SQL?
Thank you!

3 answers
sim3x, 2016-09-28
@sim3x

PostgreSQL.
Decompose the JSON into tables (normalize, i.e. get rid of duplicated text values), create indexes, and add partitioning if necessary.
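A minimal sketch of the normalization idea, using SQLite in place of PostgreSQL for a self-contained example (the schema translates directly); all table and column names are hypothetical:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Deduplicate repeated text values into lookup tables.
    CREATE TABLE hosts (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE types (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    -- The one big events table references them by id.
    CREATE TABLE events (
        ts      INTEGER NOT NULL,
        host_id INTEGER NOT NULL REFERENCES hosts(id),
        type_id INTEGER NOT NULL REFERENCES types(id),
        payload TEXT
    );
    -- Indexes to make the selections from the question fast.
    CREATE INDEX idx_events_host ON events(host_id);
    CREATE INDEX idx_events_type ON events(type_id);
""")

def insert_event(ts, host, etype, payload):
    # Reuse (or create) the lookup rows instead of storing text per event.
    con.execute("INSERT OR IGNORE INTO hosts(name) VALUES (?)", (host,))
    con.execute("INSERT OR IGNORE INTO types(name) VALUES (?)", (etype,))
    con.execute(
        "INSERT INTO events(ts, host_id, type_id, payload) "
        "SELECT ?, h.id, t.id, ? FROM hosts h, types t "
        "WHERE h.name = ? AND t.name = ?",
        (ts, payload, host, etype),
    )

insert_event(1475084962, "foobar", "emergency", '{"msg": "disk full"}')
insert_event(1475084970, "other", "info", '{"msg": "ok"}')

# "select all records with host = foobar"
rows = con.execute(
    "SELECT e.ts, t.name FROM events e "
    "JOIN hosts h ON h.id = e.host_id "
    "JOIN types t ON t.id = e.type_id "
    "WHERE h.name = ?", ("foobar",)
).fetchall()
print(rows)
```

The point is that the host and type strings are stored once each, and every event row carries only small integer ids that the indexes can scan quickly.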

lega, 2016-09-28
@lega

Yandex has ClickHouse for metrics; you can try it.
What hardware do you plan to use for this?

billions of rows
For example, 10 billion rows can take 600GB on disk, and a single index 100-600GB (RAM); more indexes mean more memory. That is, standard approaches don't cut it here.
For this I did partitioning + compression + pre-caching: in total, 600GB turned into 10GB on disk and 0.4GB of indexes, and query speed increased by ~100x compared to the usual approach (selecting rows from the table directly).
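A toy sketch of that partition + compress + index idea, assuming time-based partitions (kept in memory here; on disk each partition would be one compressed file):

```python
import gzip
import json
from collections import defaultdict

partitions = {}                # day -> compressed blob of JSON lines
host_index = defaultdict(set)  # host -> days it appears in (the small index)

def write_partition(day, events):
    lines = "\n".join(json.dumps(e) for e in events)
    partitions[day] = gzip.compress(lines.encode())  # compression step
    for e in events:
        host_index[e["host"]].add(day)               # tiny pre-built index

def select_by_host(host):
    # Only decompress the partitions the index points at,
    # instead of scanning every row in every partition.
    out = []
    for day in host_index.get(host, ()):
        for line in gzip.decompress(partitions[day]).decode().splitlines():
            e = json.loads(line)
            if e["host"] == host:
                out.append(e)
    return out

write_partition("2016-09-27", [{"host": "foobar", "type": "info"}] * 3)
write_partition("2016-09-28", [{"host": "other", "type": "emergency"}])
print(len(select_by_host("foobar")))  # only one partition is decompressed
```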

Roman Mirilaczvili, 2016-09-28
@2ord

Try searching on the topic "time series database" (TSDB): for example Akumuli, InfluxDB, and others.
These TSDBs handle large volumes of time-series data better than general-purpose RDBMSs like Postgres.
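As a rough sketch of how such an event could be encoded for a TSDB, here is a toy formatter in the style of InfluxDB's line protocol, where tags (host, type) are indexed and fields carry the payload; the measurement name and field names are made up:

```python
def to_line_protocol(measurement, tags, fields, ts_ns):
    # Tags are key=value pairs, sorted for a canonical form.
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    # String field values are quoted; numeric values are written as-is.
    field_part = ",".join(
        f'{k}="{v}"' if isinstance(v, str) else f"{k}={v}"
        for k, v in sorted(fields.items())
    )
    # measurement,tag=... field=... timestamp
    return f"{measurement},{tag_part} {field_part} {ts_ns}"

line = to_line_protocol(
    "events",
    {"host": "foobar", "type": "emergency"},
    {"value": 1.0},
    1475084962000000000,
)
print(line)
```

Queries like "all records with host = foobar over the last week" are then exactly what a TSDB is optimized for.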
