How can one organize 10,000,000,000 (billions of) inodes in Linux with fast access and processing, using the filesystem as a database replacement?
I need to search and insert billions of records. I tried Elasticsearch: after 50,000,000 records, inserts become painful. I also thought about trying Cassandra. But then I wondered why such huge, clumsy machinery should be brought in for such elementary operations. Yes, these systems are well designed and scale nicely, but the problem with all of them is generality, as with any mass-market product. As usual, you end up reinventing the wheel.
There is data of the form:

Hash    Info
aDs3g9  2:1,2,4;11:1   (where 2 and 11 are keys, 1,2,4 and 1 are the fields under those keys, and the rest are separators)
3trhn   2:9,7;3:3,4

The idea is to map each hash onto the filesystem by splitting it into nested folders, so aDs3g9 is stored at aD/s3/g9/aDs3g9 (a sketch of the whole scheme follows after this example).

Insert aDs3g9 "2:1,2"
- here the question arises: "Should I create all possible folders before the inserts, or create them during the inserts?" Let's say the folders already exist; we then insert key 2 with data 1,2.

Insert aDs3g9 "2:4"
- we see a file already exists in those folders, read it, drop the "4" into the right place, and get "2:1,2,4".

Insert aDs3g9 "11:1"
- we get "2:1,2,4;11:1".
"2:1,2;2:4;11:1"
Query aDs3g9, 3trhn
(a list of up to 1000 hashes can arrive here) - return "2:1,2,4;11:1 2:9,7;3:3,4"
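Below is a minimal Python sketch of this insert-and-merge scheme, an illustration under assumptions rather than a definitive implementation: the root directory name data, the 2-character split depth, and the helper names (path_for, insert, query) are all hypothetical, and "drop it into the right place" is taken to mean "append the new fields, keeping each value once".

```python
import os

BASE = "data"  # hypothetical root directory of the tree

def path_for(h: str) -> str:
    """Split the hash into 2-char folder levels: aDs3g9 -> data/aD/s3/g9/aDs3g9."""
    parts = [h[i:i + 2] for i in range(0, min(len(h), 6), 2)]
    return os.path.join(BASE, *parts, h)

def parse(info: str) -> dict:
    """'2:1,2,4;11:1' -> {'2': ['1', '2', '4'], '11': ['1']}"""
    record = {}
    for pair in info.split(";"):
        key, fields = pair.split(":")
        record.setdefault(key, []).extend(fields.split(","))
    return record

def dump(record: dict) -> str:
    """Inverse of parse; dict.fromkeys drops duplicate fields while keeping order."""
    return ";".join(
        key + ":" + ",".join(dict.fromkeys(fields))
        for key, fields in record.items()
    )

def insert(h: str, info: str) -> None:
    """Read-merge-write; folders are created lazily, during the insert."""
    path = path_for(h)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    record = parse(open(path).read()) if os.path.exists(path) else {}
    for key, fields in parse(info).items():
        record.setdefault(key, []).extend(fields)
    with open(path, "w") as f:
        f.write(dump(record))

def query(hashes: list) -> str:
    """Look up a batch of hashes (up to 1000) and join their Info strings."""
    return " ".join(
        open(path_for(h)).read() for h in hashes if os.path.exists(path_for(h))
    )

insert("aDs3g9", "2:1,2")
insert("aDs3g9", "2:4")
insert("aDs3g9", "11:1")
insert("3trhn", "2:9,7;3:3,4")
print(query(["aDs3g9", "3trhn"]))  # 2:1,2,4;11:1 2:9,7;3:3,4
```

Note that every insert is a read-modify-write of a small file plus several directory lookups; that per-record overhead is exactly what the answer below points at when it says a classic fs is not suitable here.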
A classic fs is not suitable for this. If your data size per "hash" is small, say up to 100 bytes, then just make one large 400 GB file and write each record at the offset derived from its hash; the hash itself then doesn't need to be stored. On an ordinary SSD you can write up to 1M records per second, even from a plain script. In this scheme about 75% of the space will sit "idle". If you want to save space, you need an index instead; use LevelDB or the like, for example.
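Here is a minimal sketch of that flat-file idea, assuming a 100-byte slot per record and that the hash, read as a base-62 number, is itself the offset into the file; the file name records.dat, the alphabet, and the helper names are hypothetical. On Linux the file can be sparse, so unwritten slots cost nothing until touched; the padding inside each written slot is the "idle" space the answer refers to.

```python
import os

SLOT = 100  # fixed record size in bytes, the answer's "up to 100 bytes"
# Assumed hash alphabet; the real one depends on how the hashes are generated.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def slot(h: str) -> int:
    """The hash, interpreted as a base-62 number, is itself the record's index."""
    n = 0
    for ch in h:
        n = n * len(ALPHABET) + ALPHABET.index(ch)
    return n

fd = os.open("records.dat", os.O_RDWR | os.O_CREAT)  # hypothetical data file

def put(h: str, info: str) -> None:
    # Pad to the slot size; this padding is the "idle" space mentioned above.
    os.pwrite(fd, info.encode().ljust(SLOT, b"\x00"), slot(h) * SLOT)

def get(h: str) -> str:
    return os.pread(fd, SLOT, slot(h) * SLOT).rstrip(b"\x00").decode()

put("aDs3g9", "2:1,2,4;11:1")
print(get("aDs3g9"))  # 2:1,2,4;11:1
```

For the space-saving variant, a key-value store such as LevelDB (e.g. via the plyvel Python bindings) replaces the fixed slots with an index, trading the wasted padding for some extra write amplification.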