Distributed Computing
Imaginarium, 2015-12-09 13:05:13

How to configure a computing cluster?

Hello.
How do I choose the right hardware configuration for a laboratory cluster intended for simulation (CFD) and machine learning workloads?
I picture the solution as follows: the cluster consists of blades in a single rack, and each blade carries:

  • CPU: Xeon E5-2667 v3 (Haswell), 2 to 4 per board;
  • Memory: at least 128 GB, with the specific modules chosen to match the motherboard;
  • GPU: Nvidia Tesla K80, 2 per board;
  • Network: InfiniBand for communication between nodes;
  • SSD: each node does not need much, 250 GB is enough;
  • Everything else (managed power and so on) goes with the rack.

But these are only my assumptions, working back from the required performance. What else do I need to consider? Power supply is not an issue, because there is a separate server room.
Do I need to take any characteristics of the OS into account when choosing the configuration? I plan to run RHEL/CentOS.
I surely cannot foresee even 10% of the pitfalls of setup and configuration, so please share your experience: what else is worth thinking about before buying hardware?
Thank you.


1 answer(s)
oleksandr_veles, 2015-12-09

Remarks.
1. A Xeon E5-2640 or E5-2660 would be a better fit, and the 2640 is also half the price.
E5-2xxx processors do not work in four-socket configurations; that requires the 4xxx series.
2. GPU: Nvidia Tesla K80. I am not sure that two of them will fit into 1U.
For value, it is better to take a Titan Black or Titan Z if you do not need a lifetime warranty (the cards will be obsolete in 3-4 years anyway). If the number crunching only needs single precision, the Titan X is a great solution.
3. Why a 250 GB SSD? 40-60 GB is enough for the system, and external storage is better for data. With 128 GB of memory, you can always carve out 100 GB as a fast RAM disk if needed.
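
Remarks 1 and 3 can be turned into a quick sanity check of a proposed node before ordering. The Python sketch below is only an illustration: the function names and the /mnt/ramdisk mount point are hypothetical, and it assumes that the first digit after "E5-" in the model name gives the maximum socket count, as remark 1 describes. The printed mount line uses the standard Linux tmpfs filesystem as one possible way to get the "fast disk" from remark 3.

```python
import re
from typing import List


def max_sockets(cpu_model: str) -> int:
    # Assumption: in "E5-2xxx" / "E5-4xxx" the first digit is the socket limit.
    match = re.search(r"E5-(\d)\d{3}", cpu_model, re.IGNORECASE)
    if match is None:
        raise ValueError(f"not an E5-series model name: {cpu_model!r}")
    return int(match.group(1))


def check_node(cpu_model: str, cpu_count: int, ram_gb: int, ramdisk_gb: int) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    limit = max_sockets(cpu_model)
    if cpu_count > limit:
        problems.append(f"{cpu_model} supports at most {limit} sockets, not {cpu_count}")
    if ramdisk_gb >= ram_gb:
        problems.append(f"a {ramdisk_gb} GB RAM disk does not fit in {ram_gb} GB of memory")
    return problems


def tmpfs_mount_command(size_gb: int, mount_point: str = "/mnt/ramdisk") -> str:
    # Hypothetical mount point; the command itself must be run as root on the node.
    return f"mount -t tmpfs -o size={size_gb}g tmpfs {mount_point}"


if __name__ == "__main__":
    print(check_node("Xeon E5-2667 v3", 4, 128, 100))  # four sockets requested
    print(check_node("Xeon E5-2640", 2, 128, 100))     # two sockets requested
    print(tmpfs_mount_command(100))
```

With the configuration from the question (E5-2667 v3, 4 per board), the check reports the socket limit; with two E5-2640 processors and a 100 GB RAM disk in 128 GB of memory it reports nothing. Keep in mind that whatever is written to a tmpfs RAM disk is taken from the memory left for the computation itself.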
