Redis vs SQLite vs PostgreSQL

HiltoN2012-05-03 19:25:40

PostgreSQL

HiltoN, 2012-05-03 19:25:40

I decided to compare the performance of these databases for several large flat tables and simple queries (by key, by index, etc.). Introduced the same subject area - the message queue: for SQLite and PostgreSQL these are the same schemes, for Redis sorted lists were used, because other storage options were not suitable for the described task. Queries: search for a message by key, deleting a message by key from the queue, fetching messages older than n minutes, fetching and deleting a message from the head of the queue.

10 million lines, all databases occupy almost 2 GB each (unfortunately there was no more RAM). Result: if the database fits into memory, then the speed of query execution is approximately the same for different "DBMS" (simple operations - about 3000 operations per second, more complex - about 600). What is the beauty of Redis then? I understand that it is suitable for narrowly focused tasks, for example, only search by key, i.e. for limited caches. Otherwise, there are only minuses: make sure that the database fits into RAM (PG just slows down by switching to disk reading, Redis starts swapping); and the set of commands and data types is limited.

PS: What do you use for reliable permanent data storage and fast execution of simple queries? Holivar PG vs MySQL can be omitted, there will be practically no difference between them.

Answer the question

In order to leave comments, you need to log in

8 answer(s)

Maxim Avanov, 2012-05-03
@HiltoN

1. Redis has a better representation of working with collections. A simple example is an incremental counter. You do incrby/hincrby for any key, without caring about its presence in the store. In Postgres, the similar sequence-based functionality (nextval('foo')) assumes that you've already created the sequence 'foo' before. This encourages you to write procedures that, before trying to change the counter, first check for its existence, create it if necessary, and only then change it. More handmade.
2. Data structures in Redis are optimized either for fast search O(1), or for compactness and acceptable arbitrariness O(N), O(log(N)). You can almost always get by with simple or nested hash tables in O(1) or O(n). In Postgres, you almost always use some kind of B/R-tree, GiST/GIN indexes with complexity O(log(N)(+N)). Prior to version 8.4, HASH indexes in Postgres had almost the same search speed as B-trees, so their use did not make any sense. Now, in version 9.1, it makes more sense, but not much - HASH indexes do not support Write-Ahead Log and require manual reindexing in case of failure:
"Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. They are also not replicated over streaming or file-based replication. For these reasons, hash index use is presently discouraged. " http://www.postgresql.org/docs/9.1/static/indexes-types.html
In my projects, I use both Redis and Postgres. The first is as an effective system for collecting online statistics (counters likes, various metrics), and the second as a repository for user accounts and content with its meta information.At the same time, there has been a tendency to transfer content to HBase, leaving Postgres only the tasks of ACID servicing operations with user accounts.

CKOPOBAPKuH, 2012-05-04
@CKOPOBAPKuH

Hammer vs Sledgehammer vs Screwdriver
I decided to find out which tool is better. He presented the same task - to hit himself on the big toe. I decided to hold the screwdriver by the handle and hit with the tip, since it is inconvenient to hold the tip and hit with the handle. For a hammer and a sledgehammer, these are the same schemes. Requests: hit the thumb and measure the time how much it hurts.
Result: if it hurts to hit, then the finger hurts. What then is the charm of a screwdriver? I understand that it is suitable for narrowly focused tasks, for example, only unscrewing or twisting, i.e. for limited tasks. Otherwise, there are only minuses: it’s uncomfortable to hold, and the area of damage is small, and I hit my finger only the third time.
PS: What do you use for reliable permanent finger tapping? Holivar classic Russian hammer vs ikea hammer can be omitted, there will be practically no difference between them.

Stdit, 2012-05-03
@Stdit

In addition to RDBMS, we use MongoDB. A wonderful and fast thing that allows you to store collections of trees of any shape, build indexes on any of their nodes, easily scales horizontally, has a fairly powerful system of read and update queries. The disadvantage is the lack of joins, problems with aggregation, they are solved by preliminary aggregation when data changes or by recounting the crown.

dborovikov, 2012-05-03
@dborovikov

> for reliable permanent data storage
Here, beware, as far as I know, almost all NoSQL are semi-persistent, that is, they do not give 100% guarantees for safety. Postgres easily turns into a kind of NoSQL if you disable synchronous commit - safety guarantees decrease, speed grows by about two orders of magnitude.
UPDATE is an expensive operation for the same postgra. Probably in memory bases are able to perform this operation quickly. So for all kinds of counters, redis is a great thing.

Alexey Sundukov, 2012-05-04
@alekciy

>What is the beauty of Redis then?
The data is always in RAM, while, as in the RDBMS, the index can be forced out to disk, while data in large volumes is almost always on disk. Therefore, we get a performance drawdown due to I / O on the disk. In Redis, this is basically not the case, as far as I remember, they have now abandoned virtual memory. About O (1) Ghostwriter has already said. ACID still brings its overhead.
Well, from a similar group of software, it is distinguished by conditional persistence.

HiltoN, 2012-05-04
@HiltoN

If possible, I'll rephrase the question: tell me a fast persistent base, which, preferably, can be mapped entirely into memory, provide high reliability in the event of a system crash, work on one machine, support multithreading, transactional support is desirable. The rest of the chips are not important.
In my opinion, all three systems described in the subject fit the definition (if not, correct it). What else can you advise?

gro, 2012-05-04
@gro

where are the tests?
Have you tested parallel queries?

Begetan, 2012-05-08
@Begetan

1. What 3000 requests per second are we talking about? On a test Intel Atom 1.6 GHz in a PHP script, I received 12,000 requests per second GET / SET
2. It is not clear why you did not use Lists, which in fact are a ready-made queue implementation?