Which database is best for storing a large dictionary?
I need to store about a billion pairs, each consisting of a four-byte key and a text string, with the fastest possible lookup of the text by key.
I don't want to reinvent the wheel if there is a ready-made solution.
Having read the answers, I see that the question needs more detail.
The database should work on an average user's computer. That rules out clusters and keeping the whole data set in RAM.
SQLite and MySQL are considerably slower than what I need.
DynamoDB as a cloud service.
Redis Cluster is another option.
With a 4-byte key, you will be limited to about 4.29 billion (2^32) distinct keys.
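To make that limit concrete, here is a quick check of the arithmetic. The figure of one billion pairs comes from the question; the variable names are just for illustration.

```python
# A 4-byte key has 32 bits, so there are 2**32 distinct key values.
KEY_BYTES = 4
max_keys = 2 ** (8 * KEY_BYTES)

# "About a billion" pairs, as stated in the question.
planned_pairs = 1_000_000_000

# Fraction of the key space the planned data set would occupy.
fill_ratio = planned_pairs / max_keys

print(max_keys)                 # 4294967296
print(planned_pairs < max_keys) # True
```

So a billion pairs fits, using a bit under a quarter of the available key space.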
If you need the fastest possible lookup, use an in-memory key-value database: Redis, Hazelcast, etc.
If it does not have to be as fast as possible, or there is not enough memory, I would store the data in a regular relational database table.
Have you tried adding an index on the 4-byte key column?
In MySQL, indexes work quite well. With the InnoDB engine (the default), the table's PRIMARY KEY is the clustered index; if there is no primary key, InnoDB uses the first UNIQUE index whose columns are all NOT NULL.
Here is the syntax: https://dev.mysql.com/doc/refman/8.0/en/create-ind...
Note the UNIQUE option: it guarantees that the key for each text value is indeed unique.
Then, when querying, check that the query is written so that the index is actually used: the data type of the value in the WHERE clause must match the column type. You can confirm that the index is used with the EXPLAIN statement in MySQL.
If everything is done correctly, query speed on a billion records should be quite acceptable.
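A minimal sketch of that check, using SQLite instead of MySQL only because it ships with Python and runs without a server. The table name `dictionary` and the columns `id` and `body` are made up for the example. SQLite's `EXPLAIN QUERY PLAN` plays roughly the role that `EXPLAIN` plays in MySQL, though the output format differs.

```python
import sqlite3

# In-memory database just for the demonstration; a real one would live on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: an integer key as PRIMARY KEY and the text as the value.
conn.execute(
    "CREATE TABLE dictionary (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO dictionary (id, body) VALUES (?, ?)",
    [(i, f"text {i}") for i in range(1000)],
)
conn.commit()

query = "SELECT body FROM dictionary WHERE id = ?"

# Ask the engine how it plans to run the query; the last column of each
# row is a human-readable description of the step.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall()
plan_text = " ".join(str(row[3]) for row in plan_rows)

# "SEARCH" means a keyed lookup; "SCAN" would mean a full table scan.
uses_key_lookup = "SEARCH" in plan_text

value = conn.execute(query, (42,)).fetchone()[0]
print(plan_text)
print(value)
```

If the plan reports a scan rather than a search, the usual suspects are a missing index or a type mismatch between the bound parameter and the column.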
Some details remain to be checked. The question does not make clear whether the database is multi-user or serves a single user working on the same machine, and whether several users access it simultaneously. In other words, more information about the scale of use would help. SQLite hints that the data is purely local, but such details are better confirmed than assumed.
Also, what is the nature of the 4-byte key: is it a number or a binary value? If it is binary, is it acceptable to convert it to an INTEGER type? This matters for index speed.
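If the key turns out to be 4 raw bytes, the conversion could look something like the sketch below. The helper names `key_to_int` and `int_to_key` are hypothetical, and big-endian byte order is an assumption; it is chosen here because it keeps the integer order the same as the byte-wise order of the original keys.

```python
KEY_BYTES = 4

def key_to_int(raw: bytes) -> int:
    """Interpret a 4-byte binary key as an unsigned 32-bit integer."""
    if len(raw) != KEY_BYTES:
        raise ValueError(f"expected {KEY_BYTES} bytes, got {len(raw)}")
    # Big-endian: the first byte is the most significant one.
    return int.from_bytes(raw, "big")

def int_to_key(number: int) -> bytes:
    """Convert the integer back to the original 4-byte key."""
    return number.to_bytes(KEY_BYTES, "big")

print(key_to_int(b"\x00\x00\x00\x2a"))  # 42
print(int_to_key(42))                   # b'\x00\x00\x00*'
```

The resulting values range from 0 to 4294967295, so the column needs an unsigned 32-bit type or anything wider.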
---
If the index has already been created and is not showing the results you expect, the details requested above will help sort it out.