How will MPI behave with CUDA (NVLink, PCIe, or CPU)?
Hello. I have a question about how MPI interacts with CUDA, namely: which transfer technology will be used?
MVAPICH2 2.3 is built with --enable-cuda, and the environment variable MV2_USE_CUDA=1 is set to enable MPI support for CUDA (CUDA 9+).
There are 2 cards on the node, connected directly by NVLink and directly by PCIe (i.e. they support P2P and RDMA).
Two MPI processes are launched on this node: rank 0 uses GPU_0 and rank 1 uses GPU_1.
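(As an aside: whether the two cards can really reach each other over P2P can be double-checked with a small probe like the sketch below. It is only a minimal sketch, assuming a single node with all visible GPUs; it uses the standard CUDA runtime queries cudaGetDeviceCount and cudaDeviceCanAccessPeer.)

// p2p_check.cu -- minimal sketch (assumed helper, not part of the original code):
// reports whether each pair of GPUs on the node can access each other's memory via P2P.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (i == j) continue;
            int canAccess = 0;
            // cudaDeviceCanAccessPeer says whether device i can map device j's memory;
            // for NVLink-connected cards this is normally 1.
            cudaDeviceCanAccessPeer(&canAccess, i, j);
            printf("GPU %d -> GPU %d : P2P %s\n", i, j,
                   canAccess ? "supported" : "not supported");
        }
    }
    return 0;
}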
And here is the code:
cudaSetDevice(rank);
double *d_out, *d_in;
cudaMallocManaged((void**)&d_out, size);
cudaMallocManaged((void**)&d_in, size);
// d_out is initialized here
// And the question arises at this point:
// Which channel will the data take when I call an MPI function to send it directly from GPU memory to GPU memory?
MPI_Status status;
// 1. Will this call use P2P (transfer over NVLink), RDMA (over PCIe), or the CPU (exchange staged through the host)?
MPI_Sendrecv(d_out, count, MPI_DOUBLE, rankTo, 777, d_in, count, MPI_DOUBLE, rankTo, 777, MPI_COMM_WORLD, &status);
// 2. Or does P2P have to be enabled explicitly, so that MPI will then use it?
cudaDeviceEnablePeerAccess(rankTo, 0);
MPI_Sendrecv(d_out, count, MPI_DOUBLE, rankTo, 777, d_in, count, MPI_DOUBLE, rankTo, 777, MPI_COMM_WORLD, &status);
cudaDeviceDisablePeerAccess(rankTo);
// 3. And there is a test variant that goes through the CPU
double *h_out, *h_in;
h_out = new double[count];
h_in = new double[count];
cudaMemcpy(h_out, d_out, size, cudaMemcpyDeviceToHost);
MPI_Sendrecv(h_out, count, MPI_DOUBLE, rankTo, 777, h_in, count, MPI_DOUBLE, rankTo, 777, MPI_COMM_WORLD, &status);
cudaMemcpy(d_in, h_in, size, cudaMemcpyHostToDevice);
delete[] h_out;
delete[] h_in;
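One practical way to see which channel variant 1 actually takes is to time the exchange and compare the effective bandwidth against the rough limits of NVLink (~25 GB/s per link), PCIe 3.0 x16 (~12 GB/s), and a host-staged path (typically slower still). Below is a minimal sketch of such a measurement; the file name, message size, iteration count, and the use of plain cudaMalloc instead of managed memory are my assumptions, not part of the original code.

// bandwidth_probe.cu -- assumed helper. Run with 2 ranks on the node,
// with MV2_USE_CUDA=1 set so MVAPICH2 accepts device pointers.
#include <cstdio>
#include <cuda_runtime.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    int rankTo = 1 - rank;                      // assumes exactly 2 ranks
    cudaSetDevice(rank);

    const int count = 1 << 24;                  // 16M doubles = 128 MiB (assumption)
    const size_t size = (size_t)count * sizeof(double);
    double *d_out, *d_in;
    cudaMalloc((void**)&d_out, size);           // plain device memory keeps the probe simple
    cudaMalloc((void**)&d_in,  size);

    const int iters = 20;
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; ++i) {
        MPI_Sendrecv(d_out, count, MPI_DOUBLE, rankTo, 777,
                     d_in,  count, MPI_DOUBLE, rankTo, 777,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    double t1 = MPI_Wtime();

    // Each iteration sends and receives `size` bytes; report one direction.
    double gbps = (double)size * iters / (t1 - t0) / 1e9;
    printf("rank %d: effective bandwidth %.1f GB/s\n", rank, gbps);

    cudaFree(d_out);
    cudaFree(d_in);
    MPI_Finalize();
    return 0;
}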