P
P
ptizza2013-04-12 11:11:42
Programming
ptizza, 2013-04-12 11:11:42

Problem of dynamic loop parallelization in MPI?

We have a cycle like this:

for (int i = 0; i <= n ; ++i )
 sum += calc(i);

The execution time of the calc() function is always different and cannot be determined in advance.
How can the loop be parallelized so that, on average, all machines are equally loaded?
Please help me to solve the problem.

Answer the question

In order to leave comments, you need to log in

3 answer(s)
1
1nd1go, 2013-04-12
@1nd1go

Similar to work stealing

M
m08pvv, 2013-04-12
@m08pvv

Producer-consumer, and if the execution time of calc is small, then you can issue tasks in small batches (several i).

P
petch, 2014-01-10
@petch

The root (zero) process distributes the values ​​of i, all the rest calculate sum according to the received i. The acceleration is obtained not in P (the number of involved processes) times, but in P-1, since one process does not participate in the calculations.

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <unistd.h>
#include <time.h>

int calc(int i) {
  int time = rand() % 1000;
  printf("calc(%d) takes %d ms\n", i, time);
  usleep(time * 1000);
  return time;
}

int main() {
  MPI_Init(0, 0);

  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  std::cout << "Hello from " << rank << std::endl;

  int sum = 0, n = 100;
  MPI_Status status;
  srand(time(NULL) + rank);

  clock_t begin = clock();
  if (rank == 0) {
    int dest, finish = -1;
    for (int i = 0; i < n; i++) {
      MPI_Recv(&dest, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &status);
      MPI_Send(&i, 1, MPI_INT, status.MPI_SOURCE, 1, MPI_COMM_WORLD);
    }
    for (int p = 0; p < size - 1; p++) {
      MPI_Recv(&dest, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &status);
      MPI_Send(&finish, 1, MPI_INT, status.MPI_SOURCE, 1, MPI_COMM_WORLD);
    }
  } else {
    int i = 0;
    while (i >= 0) {
      MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
      MPI_Recv(&i, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, &status);
      if (i >= 0) {
        printf("process %d ", rank);
        sum += calc(i);
      }
    }
  }
  clock_t end = clock();
  double time_spent = (double)(end - begin) / CLOCKS_PER_SEC;

  MPI_Allreduce(&sum, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

  if (rank == 0) {
    printf("Program execution time is  %f s\n", time_spent);
    printf("Processes calc(i) takes    %f s\n", sum / 1000.0);
    printf("CPU time (without root) is %f s\n", (size - 1) * time_spent);
  }

  MPI_Finalize();
  return 0;
}

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question