Database for billions of records and quick fetches

cawabanga, 2012-10-05 10:45:08

MySQL

Hello.

The task is to store several (about 5) billion records that may be updated slowly but must be selected quickly. The schema is multidimensional, i.e. each record is related to others through foreign keys, which also take part in the selection criteria.

For example, suppose these are cars for sale or rent: all their characteristics are scattered across other tables, and you need to search by them. There are a lot of cars. MySQL doesn't handle this very well, even with indexes.

What should I do? CouchDB? Hadoop? Or is it just a matter of good design?
After all, a billion is not such a big number.
The budget is tight.

11 answer(s)

antarx, 2012-10-05
@cawabanga

Shard and denormalize the data; the choice of database is mostly a matter of taste.
That is, minimize external (foreign-key) dependencies and keep track of them at the application level. Small tables are better stored entirely in some kind of in-memory storage (application cache, NoSQL, it doesn't matter). Next, explicitly partition the data by the main key (say, the ID of the item being sold) and store the partitions in different databases. If it suddenly turns out that non-service operations require selections that are not tied to the main key, either you are doing something wrong or that particular data should be stored in another database.
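A minimal sketch of the routing this implies, assuming a stable hash of the main key picks the shard. The shard names, the `BODY_TYPES` lookup table, and the function names are all hypothetical, made up for illustration:

```python
import zlib

# Hypothetical shard names; in practice each would be a separate database server.
SHARDS = ["cars_db_0", "cars_db_1", "cars_db_2", "cars_db_3"]

# A small lookup table kept entirely in application memory instead of being JOINed.
BODY_TYPES = {1: "sedan", 2: "hatchback", 3: "SUV"}


def shard_for(item_id: int) -> str:
    """Pick a shard from the main key using a hash that is stable across processes."""
    digest = zlib.crc32(str(item_id).encode("ascii"))
    return SHARDS[digest % len(SHARDS)]


def describe(item_id: int, body_type_id: int) -> str:
    """Resolve the dependency in the application, not in the database."""
    return f"item {item_id} ({BODY_TYPES[body_type_id]}) lives in {shard_for(item_id)}"


print(describe(42, 1))
```

With plain modulo hashing, changing the number of shards moves most keys, so the shard count is usually fixed in advance or a consistent-hashing scheme is used instead.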

gleb_kudr, 2012-10-05
@gleb_kudr

PostgreSQL.

Alexey Huseynov, 2012-10-05
@kibergus

Denormalization -> no need to do JOINs -> possibility of abandoning SQL -> possibility of horizontal scaling -> profit

Urvin, 2012-10-05
@Urvin

Is MS SQL too expensive?

Iskander Giniyatullin, 2012-10-05
@rednaxi

I would advise using MySQL, or anything else that supports fast selection by primary key, and doing the search by parameters through a specialized tool, for example Sphinx.
You simply index your database with Sphinx and search through Sphinx; it returns the IDs of the matching records, and by those IDs you quickly pull the content out of MySQL.
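The two-step flow can be sketched as follows. This is not the Sphinx API: a plain in-memory word index stands in for Sphinx and `sqlite3` stands in for MySQL, so only the shape of the flow (search returns IDs, the database is hit by primary key only) is the point. Table and function names are hypothetical:

```python
import sqlite3

# Stand-in for MySQL: only primary-key lookups are issued against it.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, title TEXT)")
rows = [(1, "red sedan diesel"), (2, "blue sedan petrol"), (3, "red suv petrol")]
db.executemany("INSERT INTO cars VALUES (?, ?)", rows)

# Stand-in for the search index: word -> set of record IDs.
index = {}
for car_id, title in rows:
    for word in title.split():
        index.setdefault(word, set()).add(car_id)


def search_ids(*words):
    """Search step: return the IDs of records that contain all the given words."""
    sets = [index.get(w, set()) for w in words]
    return sorted(set.intersection(*sets)) if sets else []


def fetch(ids):
    """Fetch step: pull full rows from the primary store by primary key."""
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    query = f"SELECT id, title FROM cars WHERE id IN ({placeholders}) ORDER BY id"
    return db.execute(query, ids).fetchall()


result = fetch(search_ids("red", "petrol"))
print(result)
```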

ToSHiC, 2012-10-05
@ToSHiC

Since updates are rare, set up several slaves; the read load on each one drops in proportion to their number.
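A rough sketch of how the application could split the traffic, assuming writes go to the master and reads are spread round-robin. The class and host names are hypothetical, and real setups also have to account for replication lag:

```python
import itertools


class ReplicaRouter:
    """Send writes to the master and spread reads across the slaves round-robin."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def target_for(self, sql: str) -> str:
        # Naive classification: anything that is not a SELECT goes to the master.
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._slaves)
        return self.master


router = ReplicaRouter("master", ["slave1", "slave2", "slave3"])
targets = [router.target_for("SELECT * FROM cars WHERE id = 1") for _ in range(4)]
print(targets)
print(router.target_for("UPDATE cars SET price = 9000 WHERE id = 1"))
```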

1nd1go, 2012-10-05
@1nd1go

Riak is recommended for this.

Puma Thailand, 2012-10-05
@opium

Give examples of the tables and queries. For MySQL, a billion records is not that many.

Dmitry, 2012-10-06
@Neir0

Document-oriented databases are great for a marketplace like this. Their writes are slow, but reads are very fast. Accordingly, no foreign keys are needed: all the information about a car is stored in one document. I have worked with RavenDB myself, but I don't think it is the best option for you; you could look towards MongoDB. Billions of records are not a problem.
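To illustrate "one document per car", here is a small sketch that resolves the foreign keys once, at write time, so a read needs no JOINs. The field names and sample data are invented for the example, and no particular database API is used:

```python
import json

# Normalized source data (hypothetical tables reduced to dicts).
makes = {10: "Toyota"}
colors = {7: "red"}
car_row = {"id": 1, "make_id": 10, "color_id": 7, "price": 9500}


def to_document(row):
    """Resolve foreign keys at write time into a self-contained document."""
    return {
        "_id": row["id"],
        "make": makes[row["make_id"]],
        "color": colors[row["color_id"]],
        "price": row["price"],
    }


doc = to_document(car_row)
print(json.dumps(doc, sort_keys=True))
```

The cost is on the write side: if a referenced value changes, every document that embeds it has to be rewritten.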

betal, 2012-10-05
@betal

MySQL slaves, NoSQL.

shagguboy, 2012-10-05
@shagguboy

You need bitmap indexes. MySQL doesn't have them.
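For anyone unfamiliar with the idea, a toy sketch: each distinct value of a low-cardinality column gets one bitmap with a bit per row, and a multi-criteria filter becomes a bitwise AND. Python integers serve as the bitmaps here; the class name and sample data are made up, and real implementations compress the bitmaps:

```python
from collections import defaultdict


class BitmapIndex:
    """One bitmap (a Python int) per distinct value of a low-cardinality column."""

    def __init__(self):
        self.bitmaps = defaultdict(int)

    def add(self, row_no: int, value) -> None:
        # Set the bit for this row in the bitmap of its value.
        self.bitmaps[value] |= 1 << row_no

    def get(self, value) -> int:
        return self.bitmaps.get(value, 0)


def rows_of(bitmap: int):
    """Decode a bitmap back into a list of row numbers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


color, fuel = BitmapIndex(), BitmapIndex()
cars = [("red", "diesel"), ("blue", "petrol"), ("red", "petrol"), ("red", "petrol")]
for n, (c, f) in enumerate(cars):
    color.add(n, c)
    fuel.add(n, f)

# color = 'red' AND fuel = 'petrol' is a single bitwise AND of two bitmaps.
matches = rows_of(color.get("red") & fuel.get("petrol"))
print(matches)
```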