MongoDB
k2m30, 2015-08-06 19:02:15

Which database can handle a 1-petabyte table with 3,000 billion records?

The task is to store and search logs of the form:
timestamp - client IP - destination IP - destination URL
There are 10 billion such records per day, or roughly 3,000 billion per year. There are no serious load requirements: the necessary records need to be looked up only a few times a month.
The plan is a simple web interface for building queries, but first I need to choose a database. Which database will cope with this task?
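For a rough sense of scale, the figures in the question can be turned into per-second and per-record numbers. This is only a back-of-envelope sketch: it assumes the 1 petabyte is decimal (10^15 bytes) and covers the stated 3,000 billion records, and the function name `sizing` is made up for illustration.

```python
# Back-of-envelope sizing from the figures given in the question.
RECORDS_PER_DAY = 10 * 10**9       # 10 billion records per day (from the question)
RECORDS_TOTAL = 3000 * 10**9       # 3,000 billion records (from the question)
TOTAL_BYTES = 10**15               # 1 petabyte, assumed to be decimal (10^15 bytes)
SECONDS_PER_DAY = 24 * 60 * 60


def sizing():
    """Return average ingest rate and storage figures implied by the question."""
    records_per_second = RECORDS_PER_DAY / SECONDS_PER_DAY
    bytes_per_record = TOTAL_BYTES / RECORDS_TOTAL
    terabytes_per_day = bytes_per_record * RECORDS_PER_DAY / 10**12
    return {
        "records_per_second": records_per_second,
        "bytes_per_record": bytes_per_record,
        "terabytes_per_day": terabytes_per_day,
    }


if __name__ == "__main__":
    for name, value in sizing().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions the average write rate is a little over 100,000 records per second, each record has a budget of a few hundred bytes, and several terabytes arrive per day, so sustained ingestion matters far more than the rare searches.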

6 answer(s)
Andrey Burov, 2015-08-06
@BuriK666

SQLite

Ilya Erofeev, 2015-08-07
@imerofeev

This might help. At a recent local conference, there was a talk on how Avito stores its logs:
They don’t just store the data, they also monetize it through ad targeting and so on.
The talk was interesting; it’s a pity no video recording survived.
Here is another link to the slides for this talk.

Vlad Zhivotnev, 2015-08-06
@inkvizitor68sl

MongoDB + sharding.

Igor Vorotnev, 2015-08-06
@HeadOnFire

Have you heard about Big Data? At this scale, forget about a classic database.

Vitaly Pukhov, 2015-08-07
@Neuroware

First, you need to decide what exactly you will be searching for in this heap, that is, what your starting point is. It is one thing to find all destination URLs for a given IP, and quite another to find all the IPs that visited a given destination URL. The storage architecture should be different in the two cases. In any case, tasks of this magnitude are solved by professionals with the appropriate qualifications. Such things are not done on the strength of a recommendation from Toster: at best, a homemade solution will be hellishly slow and take years to run a "search"; at worst, at some point you will lose data.
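To make the point about access patterns concrete, here is a toy in-memory sketch, not a design for a petabyte store. The same records are grouped under two different keys, and each grouping answers only one of the two questions cheaply. The names `LogRecord`, `index_by`, `urls_for_client` and `clients_for_url`, as well as the sample data, are invented for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class LogRecord(NamedTuple):
    timestamp: int
    client_ip: str
    dest_ip: str
    dest_url: str


def index_by(records, key_field):
    """Group records under the value of one field (a stand-in for a partition key)."""
    index = defaultdict(list)
    for rec in records:
        index[getattr(rec, key_field)].append(rec)
    return index


# Made-up sample data in the format from the question.
RECORDS = [
    LogRecord(1438880535, "10.0.0.1", "93.184.216.34", "example.com/a"),
    LogRecord(1438880536, "10.0.0.2", "93.184.216.34", "example.com/a"),
    LogRecord(1438880537, "10.0.0.1", "93.184.216.34", "example.com/b"),
]

BY_CLIENT = index_by(RECORDS, "client_ip")  # layout for "all URLs for a given IP"
BY_URL = index_by(RECORDS, "dest_url")      # layout for "all IPs for a given URL"


def urls_for_client(ip):
    """One key lookup in the client-keyed layout."""
    return sorted({rec.dest_url for rec in BY_CLIENT.get(ip, [])})


def clients_for_url(url):
    """One key lookup in the URL-keyed layout."""
    return sorted({rec.client_ip for rec in BY_URL.get(url, [])})
```

With only the client-keyed layout, answering `clients_for_url` would mean scanning every record. On a distributed store the same trade-off usually shows up as the choice of partition or row key, or as the decision to keep the data in two layouts at roughly double the storage cost.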

Philipp, 2015-08-07
@zoonman

You can read up on it yourself: kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
I would recommend looking at Hadoop-ecosystem databases like HBase, or at Cassandra.
But MongoDB will do just fine, too.
