NoSQL
Slader, 2013-04-14 13:10:55

What should we change MongoDB to?

The server (2x E5-2630, 128 GB RAM, SAS disks) currently runs MongoDB 2.4.1.
It has two collections (in different databases):
- Collection 1: 70 million records (120 inserts/s, 60 finds/s, ~9 GB including indexes).
- Collection 2: 40 million records (50 inserts/s, 40 finds/s, ~11 GB including indexes).

These numbers don't look large for hardware like this. In reality there are an order of magnitude more read requests, but most of them are served from the cache (Redis); a request only reaches Mongo on a cache miss.
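
For reference, the cache-miss path looks roughly like this (a minimal Python sketch assuming redis-py and pymongo; host names, database/collection names and the TTL are placeholders, not our real config):

import json

import redis
from pymongo import MongoClient

r = redis.Redis(host="localhost", port=6379)
coll = MongoClient("mongodb://localhost:27017")["db1"]["coll1"]

def get_record(key):
    """Return a record by _id, touching MongoDB only on a cache miss."""
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    doc = coll.find_one({"_id": key})          # this is the slow path
    if doc is not None:
        r.set(key, json.dumps(doc), ex=3600)   # repopulate the cache for an hour
    return doc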

The problem is that some find operations constantly take 180-600 ms. Judging by the profiler reports and the logs, they are waiting for the write lock to be released. For example:
Sun Apr 14 13:40:44.859 [conn3581] query db1.coll1 query: { _id: "14638g27189a6a957c6a792151df31b7" } ntoreturn:1 idhack:1 keyUpdates:0 locks(micros) r:188697 reslen:105 188ms
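
To see which operations are slow and what they spend time on, we pull them out of the profiler; a sketch assuming pymongo, with the 100 ms threshold and the printed fields chosen purely for illustration:

from pymongo import DESCENDING, MongoClient

db = MongoClient("mongodb://localhost:27017")["db1"]

db.command("profile", 1, slowms=100)  # record only operations slower than 100 ms

for op in db["system.profile"].find().sort("ts", DESCENDING).limit(10):
    print(op.get("op"), op.get("ns"), op.get("millis"), op.get("lockStats"))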

Against the database I only ever do findOne or insert, and the data in an insert never exceeds 120 bytes.

iostat -x
rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
0.01 5.55 0.57 27.60 35.60 1531.59 111.28 0.41 14.46 8.01 14.60 1.61 4.53

The question: what should we do? The data volume will grow to 500-1000 million records in the future, and reads will reach up to 10,000/s.
I need a very low-latency database for reads by key. There are no complex queries; it is just the backing store for the cache. And Mongo with its per-database write lock ruins everything.

Another option is to read only from a slave node and write to the master, but I'm afraid of running into replication lag and, as a result, stale data.
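
If we went that route, reads would be routed to a secondary via the driver's read preference; a rough pymongo sketch (replica-set name and hosts are made up), keeping in mind that a lagging secondary can return stale data:

from pymongo import MongoClient, ReadPreference

client = MongoClient("mongodb://host1,host2,host3/?replicaSet=rs0")
coll = client["db1"].get_collection(
    "coll1", read_preference=ReadPreference.SECONDARY_PREFERRED
)

doc = coll.find_one({"_id": "14638g27189a6a957c6a792151df31b7"})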

What do you advise? HBase, Hypertable? Reads from the database must stay within 2-5 ms at most.


18 answer(s)
Slader, 2013-04-14
@Slader

Because 1 billion keys is already about 166 GB, which is still fine for a server with 256 GB of RAM.
But if we also store the data for each key, we won't fit in memory: at most 512 GB can be installed on this motherboard.
We could, of course, do application-level sharding for Redis or use memcached; we are thinking about that now (a rough sketch is at the end of this comment).
On the other hand, there is no need to keep the whole billion in memory at all: at most 100 million of it is active. We would still like to keep the rest in a database and, when needed, quickly read it back out and put it in the cache.
Perhaps we should switch to a different kind of database? We need deferred writes (insert only) and fast reads (no more than 2 ms). We are ready to install SSDs if that helps.
After all, an insert goes into the cache first, and all nodes take data from the cache, so it doesn't really matter how long the database takes to persist it.
But minimal latency on a cache miss is very important.
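
The application-level Redis sharding mentioned above could be as simple as hashing the key to pick an instance; a minimal sketch (the node list and hash choice are assumptions, and plain modulo means resharding whenever nodes are added):

import zlib

import redis

NODES = [
    redis.Redis(host="redis-1", port=6379),
    redis.Redis(host="redis-2", port=6379),
    redis.Redis(host="redis-3", port=6379),
]

def node_for(key):
    """Pick a Redis shard deterministically from the key."""
    return NODES[zlib.crc32(key.encode()) % len(NODES)]

node_for("14638g27189a6a957c6a792151df31b7").set("some-key", "some-value")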

amakhnach, 2013-04-14
@amakhnach

Look towards Cassandra DB.

BlessMaster, 2013-04-14
@BlessMaster

Have you tried playing with fsync?
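
If that means relaxing write durability, in pymongo it would look roughly like this (a sketch; the write-concern values are just an illustration, not a recommendation):

from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017")
fast_coll = client["db1"].get_collection(
    "coll1", write_concern=WriteConcern(w=1, j=False)  # acknowledged, but no journal wait
)

fast_coll.insert_one({"_id": "some-key", "payload": "up to ~120 bytes"})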

Yuri Shikanov, 2013-04-14
@dizballanze

Why not completely switch to Redis? Memory is cheap these days.

necromant2005, 2013-04-15
@necromant2005

The fundamental problem with writes is that every insert forces the indexes to be updated.
Therefore the only way to solve the problem radically is to split the database into parts (sharding), so that write operations are spread across all shards (ideally uniformly; that depends on the shard-key algorithm). As a result:
writes_per_node = total_writes / number_of_nodes
With 10,000 writes per second and 100 nodes, that is 10000/100 = 100 write operations per second per node.
There really aren't any other ways to scale writes.
opium suggested it correctly: the easiest option is sharding inside Mongo itself, so the lock only covers part of the data (a rough sketch of the commands is below).
Cassandra/Riak would probably be a better fit, but again these are cluster solutions: more nodes, more performance.
And the bottom line: staying on a single server without write stalls is not going to work.
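
A rough sketch of that in-Mongo sharding option, assuming a mongos router is already set up (the host name is a placeholder; the namespace is taken from the log line in the question):

from pymongo import MongoClient

mongos = MongoClient("mongodb://mongos-host:27017")

mongos.admin.command("enableSharding", "db1")
mongos.admin.command("shardCollection", "db1.coll1", key={"_id": "hashed"})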

Emmaseven, 2013-04-18
@Emmaseven

A very high-performance key-value store: fallabs.com/kyototycoon/

boodda, 2013-04-23
@boodda

With hardware like this, why not use MySQL or Postgres with the data partitioned into blocks of 1M-10M rows, an auto-increment BIGINT id as the primary key and a fixed-size (fixed) data field? The lookup then essentially comes down to picking the right partition by id and locating the record at the offset id*row_len. This will work very fast even from disk, provided the table files are not physically fragmented, and from memory I don't think it would be any slower than Mongo. But of course it needs testing. Persistent connections will be required here, I think.
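
The offset arithmetic behind the fixed-size-row idea, as a tiny Python illustration (the file name and row length are made up):

ROW_LEN = 128  # fixed record size in bytes

def read_record(path, rid):
    """Read one fixed-size record by seeking straight to its byte offset."""
    with open(path, "rb") as f:
        f.seek(rid * ROW_LEN)
        return f.read(ROW_LEN)

# record = read_record("records.dat", 12345)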

flyaway, 2013-10-17
@flyaway

www.tokutek.com/products/tokumx-for-mongodb/
It shrinks the database roughly threefold, handles writes much better, and mongos works much better.

meeshaeel, 2013-11-15
@meeshaeel

Try Aerospike http://www.aerospike.com
Preferably on machines with plenty of RAM and SSDs.
The Community edition limits the database size to 200 GB.

Slader, 2013-04-14
@Slader

Has anyone used Berkeley DB on an SSD? If so, tell me how it went.
I only need find and insert; delete is not needed at all.

realduke, 2013-04-14
@realduke

Maybe look at PostgreSQL instead, especially considering that you have only one server.
There is a presentation on this: wiki.postgresql.org/images/b/b4/Pg-as-nosql-pgday-fosdem-2013.pdf. Of course, benchmarks will decide, but still.
Personally, MongoDB has always seemed like a crutch to me: a nice API and plenty of features, but as soon as it comes to operations, flaws keep crawling out. Friends who run it reported plenty of pain as well.
Riak + Redis is better when you need many nodes, with all the pros and cons of Dynamo-style storage that come with it.

sowich, 2013-04-14
@sowich

Perhaps OrientDB will show good results.

Puma Thailand, 2013-04-15
@opium

Could you do the finds through something like Sphinx search?

Iliapan, 2013-04-15
@Iliapan

I don't know exactly what causes the slowdowns in Mongo, but in MySQL your task is solved with a table using the ARCHIVE engine. On a system like yours it would easily digest your load and still leave memory for a dozen virtual machines. You badly misjudged the platform, followed the fashion, and now you're paying for it :)

Tenkoff, 2013-05-15
@Tenkoff

LevelDB

Vitaly F., 2014-04-08
@FuN_ViT

IMHO your architecture is simply not right for this. Just add a read-only replica and read only from it. And goodbye write lock...

Adam_Ether, 2014-05-14
@Adam_Ether

It seems to me you should look towards Riak (the DynamoDB class of systems and the like).
But, as was rightly noted above, their advantage only shows when used in a cluster; running a single instance is like firing a cannon at sparrows.
Here are good write-ups on how to migrate everything, for example to AWS DynamoDB:
www.masonzhang.com/2013/07/lean7-migrate-from-mong...
news.dice.com/2013/02/21/why-my-team-went-with-dyn...
blog.cloudthat.in/5-reasons-why-dynamodb-is-better...

kolofut, 2014-08-15
@kolofut

Look at Elasticsearch: although it is positioned mainly as a full-text search engine, it does great in the role of a NoSQL database. We have been using it this way and are very happy with it. Here is an example of such use (a Habr translation). It is especially convenient that queries to Elastic are JSON; after Mongo it will feel familiar (and convenient too).
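
Used as a key-value store, that boils down to indexing a document under its key and fetching it back by id; a hedged sketch assuming a 7.x-era elasticsearch-py client (the index name and field are made up):

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

es.index(index="records", id="14638g27189a6a957c6a792151df31b7",
         body={"payload": "up to ~120 bytes"})
doc = es.get(index="records", id="14638g27189a6a957c6a792151df31b7")
print(doc["_source"])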
