How to choose a storage system for 3 trillion events?
I need to choose a storage system that will ingest a large volume of same-type events (up to 3 million per second).
A storage depth of 1 month is approximately 3 trillion events.
Events will be queried with filters on their fields, on average once per second.
Accordingly, the storage must scale horizontally to 100-1000 nodes, be a reliable and proven solution, tolerate node failures, run fast queries on various criteria with sorting, and support a Java client.
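As a rough check of the figures above, here is a minimal Java sketch of the volume arithmetic. It assumes a 30-day month and a constant rate; the class and method names are made up for illustration. At the peak rate of 3 million events per second, a month holds about 7.8 trillion events, so a depth of roughly 3 trillion corresponds to an average of about 1.16 million events per second.

```java
public class EventVolume {
    static final long SECONDS_PER_DAY = 86_400L;

    // Events accumulated over `days` at a constant rate (events per second).
    static long eventsStored(long eventsPerSecond, int days) {
        return eventsPerSecond * SECONDS_PER_DAY * days;
    }

    // Average rate (events per second) implied by a total count over `days`.
    static double impliedRate(long totalEvents, int days) {
        return (double) totalEvents / (SECONDS_PER_DAY * days);
    }

    public static void main(String[] args) {
        // Assumption: a 30-day month.
        System.out.println("At peak rate: " + eventsStored(3_000_000L, 30) + " events");
        System.out.printf("Rate implied by 3 trillion: %.0f events/s%n",
                impliedRate(3_000_000_000_000L, 30));
    }
}
```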
Yandex ClickHouse (it comes from Yandex, but it fits this task exactly)
Aerospike
You could also start with DynamoDB: everything is ready to use there, you just pay.
However, I'm not very sure about answering within 5-10 seconds over 3 trillion events; one way or another, you will have to preprocess something.
Tarantool or Aerospike? Or perhaps it is worth looking at time-series databases?
https://www.influxdata.com/influxdb-vs-cassandra-b...
Maybe Cassandra can handle it with an insane number of servers, but in general, more than a million writes per second is hard to achieve at present.
An SSD writes at up to 550 MB/s; if the events are 20 bytes each, you can pour ~27 million events per second into files (a single channel would not be enough to saturate it).
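The estimate above can be reproduced with a small Java sketch. It assumes decimal megabytes (550 MB/s = 550,000,000 bytes per second) and ignores file system and batching overhead; the class and method names are hypothetical. It also shows the reverse direction: 3 million 20-byte events per second is only about 60 MB/s of raw payload.

```java
public class SsdThroughput {
    // How many fixed-size events fit into a given sequential write bandwidth.
    static long eventsPerSecond(long bytesPerSecond, int eventSizeBytes) {
        return bytesPerSecond / eventSizeBytes;
    }

    // Raw payload bandwidth needed to sustain a given event rate.
    static long bytesPerSecond(long eventsPerSecond, int eventSizeBytes) {
        return eventsPerSecond * eventSizeBytes;
    }

    public static void main(String[] args) {
        long ssdBandwidth = 550_000_000L; // assumption: 550 MB/s, decimal megabytes
        int eventSize = 20;               // bytes per event, as in the estimate above

        System.out.println("Events/s one SSD can absorb: "
                + eventsPerSecond(ssdBandwidth, eventSize));
        System.out.println("Bytes/s needed for 3M events/s: "
                + bytesPerSecond(3_000_000L, eventSize));
    }
}
```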
"The selection of events will occur using filters by fields on average once per second."
Pour the data in along the "length" of the filters and it will be fine.
Akumuli can record 4.5 million events per second on a single m3.2xlarge instance (if the events are represented as a combination of a set of tags, a timestamp and a float).