Database for billions of records and quick fetches
Hello.
The task: store several (about 5) billion records. Updates can be slow, but selects must be fast. The schema is multidimensional, i.e. each record is related to others through foreign keys, and those relations also take part in the selection criteria.
For example, suppose these are cars for sale or rent: all their characteristics are spread across other tables, and you need to search by them. There are a lot of cars. MySQL does not handle this well, even with indexes.
What should I do? CouchDB? Hadoop? Or is it simply a matter of good design?
Still, a billion is not such a big number.
Not enough money.
Shard and denormalize the data; the choice of database is mostly a matter of taste.
That is, minimize external dependencies and track them at the application level. It is better to keep small tables entirely in some in-memory store (application cache, NoSQL, it doesn't matter which). Next, explicitly partition the data by the main key (say, the number of the item being sold) and store the partitions in different databases. If it turns out that non-service operations require selections that are not tied to the main key, either you are doing something wrong or that particular data belongs in a separate database.
Denormalization -> no need to do JOINs -> possibility of abandoning SQL -> possibility of horizontal scaling -> profit
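A minimal sketch of the "partition by the main key" step, assuming a stable hash over the item number decides which database holds the record. The shard names and the `shard_for` helper are hypothetical, not something from the answer above.

```python
import zlib

# Hypothetical shard names; in practice each would be a separate database.
SHARDS = ["cars_db_0", "cars_db_1", "cars_db_2", "cars_db_3"]


def shard_for(item_id: int) -> str:
    """Pick a shard from the main key using a stable hash."""
    # zlib.crc32 gives the same value in every process, so routing is repeatable.
    digest = zlib.crc32(str(item_id).encode("utf-8"))
    return SHARDS[digest % len(SHARDS)]


# Every read and write for item 12345 goes to the same database.
target = shard_for(12345)
```

With plain modulo routing, changing the number of shards moves most keys, so the shard count is best chosen with room to grow.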
I would advise using MySQL, or anything else that supports fast selects by primary key, and searching by parameters through a dedicated tool, for example Sphinx.
You index your database with Sphinx and run the search through Sphinx; it returns the ids of the matching records, and by those ids you quickly pull the content from MySQL.
Since you rarely update, set up several slaves (read replicas); the read load is split across them, so each handles roughly 1/N of it.
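The two-step lookup can be sketched as follows. This is only an illustration under loud assumptions: an in-memory SQLite table stands in for MySQL, and a tiny hand-built inverted index stands in for Sphinx. None of it uses the real Sphinx API, and the table and function names are made up.

```python
import sqlite3

# Stand-in for the MySQL table: it only needs fast primary-key lookups.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, title TEXT)")
db.executemany(
    "INSERT INTO cars (id, title) VALUES (?, ?)",
    [(1, "red Toyota sedan"), (2, "blue Ford truck"), (3, "red Ford sedan")],
)

# Stand-in for the search engine: an inverted index from word to record ids.
index = {}
for car_id, title in db.execute("SELECT id, title FROM cars").fetchall():
    for word in title.lower().split():
        index.setdefault(word, set()).add(car_id)


def search_ids(query):
    """Step 1: return ids of records that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


def fetch_by_ids(ids):
    """Step 2: pull the full rows by primary key."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    sql = f"SELECT id, title FROM cars WHERE id IN ({placeholders}) ORDER BY id"
    return db.execute(sql, ids).fetchall()


rows = fetch_by_ids(search_ids("red sedan"))
```

The relational database never sees the parameter search; it only answers `WHERE id IN (...)`, which is the access pattern it handles well at this scale.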
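As a rough sketch of how reads could be spread over the replicas, assuming simple round-robin selection in the application (the host names and `pick_host` are hypothetical):

```python
import itertools

# Hypothetical host names: one primary for the rare updates, several read replicas.
PRIMARY = "db-primary"
REPLICAS = ["db-replica-1", "db-replica-2", "db-replica-3"]

_next_replica = itertools.cycle(REPLICAS)


def pick_host(is_write: bool) -> str:
    """Send writes to the primary and rotate reads across the replicas."""
    if is_write:
        return PRIMARY
    return next(_next_replica)


# Six reads in a row: each of the three replicas is picked twice.
read_hosts = [pick_host(is_write=False) for _ in range(6)]
```

Replication is asynchronous in a typical setup, so a read that must see a just-written value may still need to go to the primary.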
Give examples of your tables and MySQL queries.
A billion records is not that much.
For a marketplace, document-oriented databases are a great fit. Their writes are slow, but reads are very fast. Accordingly, no foreign keys are needed: all information about a car is stored in one document. I have worked with RavenDB myself, but I don't think it is the best option for you; you could look at MongoDB. Billions of records are not a problem.
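To illustrate "all information about the car in one document", here is a small sketch that resolves the foreign keys once, at write time, and produces a self-contained document. The lookup tables, field names and `to_document` are invented for the example; no particular database driver is used.

```python
import json

# Hypothetical normalized data: small lookup tables plus a row with foreign keys.
makes = {1: "Toyota", 2: "Ford"}
colors = {1: "red", 2: "blue"}
car_row = {"id": 10, "make_id": 1, "color_id": 1, "price": 9000}


def to_document(row):
    """Resolve foreign keys once so later reads need no JOINs."""
    return {
        "_id": row["id"],
        "make": makes[row["make_id"]],
        "color": colors[row["color_id"]],
        "price": row["price"],
    }


document = to_document(car_row)
serialized = json.dumps(document, sort_keys=True)
```

The cost is paid on writes: if a lookup value changes, every document that embeds it has to be rewritten, which fits a workload where updates are rare.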