Cluster of 10 Atom D2700 vs one i7-2600?

Alexey Pomogaev, 2012-03-02 09:26:38

MongoDB

We have a huge MongoDB database, and because of the large number of records the query time is unsatisfactory. Clearly it needs sharding. It currently runs on an i7-2600, and the shards are planned to run on Atom D2700 machines (because of cost).
I looked at these benchmarks: www.cpubenchmark.net/cpu.php?cpu=Intel+Atom+D2700+... The difference in the scores (I haven't figured out yet what exactly the tests measure) is about 12x.
So does it make sense to shard in order to get a significant gain in query speed if one i7 is replaced with 10 Atoms?
That works out to 40 logical cores (counting HT) and 40 GB of RAM against 8 logical cores (counting HT) and 32 GB of RAM.
Yes, the queries run in parallel, and the more of them run in parallel, the bigger the performance gain. But of course a single query may become significantly slower when it runs on an Atom D2700. That is, the comparison is 100 queries on the i7 without sharding versus 100 queries on Atom D2700s with sharding. What will ultimately work faster, and will it be many times faster?
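
The numbers in the question can be put into a rough back-of-envelope comparison. This is only a sketch: it assumes the roughly 12x benchmark gap carries over directly to this workload and that sharding scales perfectly with no network or routing overhead, neither of which is established here. The function name is made up for illustration.

```python
def aggregate_cpu_ratio(n_nodes, single_node_slowdown):
    """Total CPU capacity of n slow nodes relative to one fast node.

    single_node_slowdown is how many times slower one slow node is
    (about 12 according to the benchmark scores quoted in the question).
    Assumes perfect scaling across nodes, which is optimistic.
    """
    return n_nodes / single_node_slowdown


# Figures taken from the question.
atoms = 10
benchmark_gap = 12.0            # i7-2600 score / Atom D2700 score, approximate
atom_threads, i7_threads = 4, 8  # logical cores, counting HT
atom_ram_gb, i7_ram_gb = 4, 32

print("logical cores:", atoms * atom_threads, "vs", i7_threads)
print("RAM, GB:", atoms * atom_ram_gb, "vs", i7_ram_gb)
print("aggregate CPU relative to one i7: %.2f"
      % aggregate_cpu_ratio(atoms, benchmark_gap))
```

Under these assumptions the ten Atoms together come out at a little under one i7 in raw CPU terms, so any gain would have to come from the extra RAM, the extra disks and the parallelism rather than from compute.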

6 answer(s)

Alexander Chekalin, 2012-03-02
@achekalin

I don't have an answer, only a request: if you do go ahead with this, please write up the results (ideally as a full post). The topic is very interesting, and I'm considering it myself.

bdmalex, 2012-03-02
@bdmalex

"What will ultimately work faster, and will it be many times faster?"
In my opinion, no theoretical calculation can tell you which option is better for your particular workload. Only running it and testing will reveal the truth.
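
A minimal sketch of what such a test could look like, assuming the workload is about 100 queries issued in parallel as described in the question. The `measure` function and its parameters are hypothetical, and `run_query` is a placeholder for whatever actually executes one query against the database; here it is replaced by a short sleep so the example runs on its own.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure(run_query, n_queries=100, n_workers=100):
    """Run n_queries calls of run_query in parallel threads.

    Returns total wall-clock time plus mean and max per-query latency,
    so throughput and single-query slowdown can be compared separately.
    """
    def timed(_):
        start = time.perf_counter()
        run_query()
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        latencies = list(pool.map(timed, range(n_queries)))
    wall = time.perf_counter() - wall_start

    return {
        "wall_s": wall,
        "mean_latency_s": sum(latencies) / len(latencies),
        "max_latency_s": max(latencies),
    }


# Stand-in for a real query: replace with a call to the actual database.
stats = measure(lambda: time.sleep(0.001), n_queries=20, n_workers=20)
print(stats)
```

Running the same harness against the single i7 and against a small trial cluster would give comparable numbers for both total time and per-query latency.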

Puma Thailand, 2012-03-03
@opium

Show us the output of top from the server.
I don't know why people give strange advice without seeing top.
Is the CPU really the bottleneck? How large are the databases in MongoDB?
Is MongoDB the only thing on the server, or is something else running there too?

karellen, 2012-03-02
@karellen

Cons of the Atoms:
1. System overhead. On each of the 10 machines with 4 GB of memory, say 500 MB will be eaten by various system things. The remaining 3.5 GB will not be enough for MongoDB, even with a small database.
2. Network overhead. A good gigabit controller will put a decent load on a weak Atom through the network subsystem alone.
3. Replication and backups. 10 more Atoms? It seems it is not yet possible to replicate from a sharded system to one big but dumb server.
4. With 10 Atoms, something is 10 times more likely to break.
You can probably come up with more. Points 3 and 4 do apply to any multi-server configuration, though, and in really large systems you just have to put up with them.
Purely speculatively, I can suggest:
1. Put 250-500 GB SSDs in RAID1 and place the database on them. Reads will be almost as fast as from memory.
2. Get a 3930K instead of the 2600 and expand the memory to 64 GB. Replicate to something weaker.
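
The memory-overhead and failure-rate points from the list of cons can be turned into numbers. This is a rough sketch: the 500 MB overhead is the guess from the post, the 2% per-machine failure probability is a purely hypothetical figure chosen for illustration, and machines are assumed to fail independently.

```python
def usable_ram_gb(n_nodes, ram_per_node_gb, os_overhead_gb):
    """Return (RAM left for the database per node, total across all nodes)."""
    per_node = ram_per_node_gb - os_overhead_gb
    return per_node, per_node * n_nodes


def p_any_failure(n_nodes, p_single):
    """Probability that at least one of n independent nodes fails."""
    return 1.0 - (1.0 - p_single) ** n_nodes


per_node, total = usable_ram_gb(10, 4.0, 0.5)
print("usable RAM per Atom, GB:", per_node)
print("usable RAM across the cluster, GB:", total)

p = 0.02  # hypothetical failure probability of one machine over some period
print("one machine:  %.3f" % p_any_failure(1, p))
print("ten machines: %.3f" % p_any_failure(10, p))
```

With 4 GB per machine and 0.5 GB of overhead, each shard has 3.5 GB to work with and the cluster 35 GB in total. For small per-machine failure probabilities the chance of at least one failure among ten machines is close to, but slightly below, ten times the single-machine figure.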

demark, 2012-03-02
@demark

Please clarify one point: in the question you write that you are not satisfied with the query time, and in the comments you write that all the indexes fit in memory. But the index is just a "pointer" to the block on the hard drive where the data is stored, and the read itself still goes to the hard drive, which takes 5-10 ms.
Have you thought about storing the data on an SSD? Its random access time is about 0.1 ms, which can be up to 2 orders of magnitude faster than an HDD.
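
To see what those latencies mean for a single query, here is a small sketch. It assumes every document fetched costs one random read that misses the cache and that reads happen one after another; the figure of 1000 reads per query is hypothetical.

```python
def random_read_time_ms(n_reads, latency_ms):
    """Total time for n serial random reads at the given per-read latency."""
    return n_reads * latency_ms


reads_per_query = 1000  # hypothetical number of uncached random reads

hdd_fast = random_read_time_ms(reads_per_query, 5.0)   # HDD, 5 ms per read
hdd_slow = random_read_time_ms(reads_per_query, 10.0)  # HDD, 10 ms per read
ssd = random_read_time_ms(reads_per_query, 0.1)        # SSD, ~0.1 ms per read

print("HDD: %.0f-%.0f ms" % (hdd_fast, hdd_slow))
print("SSD: %.0f ms" % ssd)
print("speedup: %.0fx-%.0fx" % (hdd_fast / ssd, hdd_slow / ssd))
```

Under these assumptions the HDD spends 5 to 10 seconds on such a query and the SSD about 0.1 second, a 50x to 100x difference that does not depend on the CPU at all.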

okazymyrov, 2012-03-02
@okazymyrov

Does the DB sit on one disk or on several? I.e., what difference does it make how many processors you have if the throughput to the database does not change? In that case performance will be limited only by the bandwidth of the channel to the database.