Why does -pthread arbitrarily speed up a program?

C++ / C#

German, 2020-06-20 23:06:13

Hello.
I have the following program:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <omp.h>
#include <time.h>

#define ARRAY_SIZE 1024

// structures holding the start and finish timestamps
struct timespec start_time = {0}, finish_time = {0};
// variable for the elapsed time in nanoseconds
long time_delta = 0;

// sorts the array with bubble sort
void bubble_sort(unsigned int* array) {
    unsigned int tmp = 0;
    bool no_swap = 0;
    // i is unsigned, so "i >= 0" is always true; "i > 0" is the intended bound
    for (unsigned int i = ARRAY_SIZE - 1; i > 0; --i)
    {
        no_swap = 1;
        {
            //#pragma omp parallel for num_threads(4)
            for (unsigned int j = 0; j < i; j++)
            {
                if (array[j] > array[j + 1])
                {
                    tmp = array[j];
                    array[j] = array[j + 1];
                    array[j + 1] = tmp;
                    no_swap = 0;
                }
            }
        }
        if (no_swap)
            break;
    }
}

int main(int argc, char* argv[]) {
    (void)argc;
    (void)argv;
    // seed the random number generator
    srand(time(NULL));
    // allocate memory for the test
    unsigned int* array = malloc(sizeof(unsigned int) * ARRAY_SIZE);
    if(!array) { return -1; }
    // generate the array
    for(unsigned int i = 0; i < ARRAY_SIZE; ++i) {
        array[i] = rand() % ARRAY_SIZE;
    }
    // record the process CPU time at the start
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start_time);

    // sort the array
    bubble_sort(array);

    // record the time at the end
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &finish_time);
    time_delta = (1000000000 * (finish_time.tv_sec - start_time.tv_sec)) + (finish_time.tv_nsec - start_time.tv_nsec);
    printf("Elapsed time: %li\n", time_delta);
    free(array);
    return 0;
}

If I compile it with gcc main.c -o prog, it prints a number of about 20000000. If I add the -pthread flag, the reported time drops to about 12000000. I don't understand why adding this flag alone reduces the execution time.

Originally I wanted to measure the time gained from various OpenMP directives, but since the -fopenmp flag automatically implies -pthread, the running time drops by itself, which ruins the accuracy of the comparison.

What should I do?

1 answer(s)

Armenian Radio, 2020-06-21
@gbg

German,
1) Benchmarks should be run under the most controlled conditions possible: on a machine where nothing is running besides the OS and your program, for example, no browser with a pile of open tabs.
2) There is no point in benchmarking something as tiny as 1024 elements. It should finish in microseconds, yet your measurement error is on the order of milliseconds, which more likely points to a poor timer (or to you having dug up an original 8080 somewhere).
3) Benchmarking bubble sort makes even less sense: who needs it, with its N^2 complexity? Other sorting algorithms are better suited to parallelization.
To start with, increase the size of the array being sorted by a factor of a million.
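
One way to act on this advice is sketched below. It is only a sketch under a few assumptions, not a definitive harness: the helper names best_sort_time_ns, now_ns and is_sorted are made up for illustration, and taking the fastest of several runs is just one common way to reduce noise. It times with CLOCK_MONOTONIC (wall-clock time) rather than CLOCK_PROCESS_CPUTIME_ID, because process CPU time is summed over all threads and so would not show a speedup once OpenMP threads share the work.

```c
#define _POSIX_C_SOURCE 200809L /* expose clock_gettime() under strict -std=c11 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Same algorithm as in the question, but with the length passed in. */
static void bubble_sort(unsigned int *a, size_t n)
{
    for (size_t i = n; i > 1; --i) {
        bool swapped = false;
        for (size_t j = 0; j + 1 < i; ++j) {
            if (a[j] > a[j + 1]) {
                unsigned int tmp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped)
            break;
    }
}

/* Returns true if the array is in non-decreasing order. */
static bool is_sorted(const unsigned int *a, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        if (a[i - 1] > a[i])
            return false;
    return true;
}

/* Wall-clock time in nanoseconds, read from a monotonic clock. */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Sorts a fresh copy of the same pseudo-random input `runs` times and
 * returns the fastest run in nanoseconds, or -1 on bad arguments or
 * allocation failure. (Hypothetical helper, not part of the question.)
 */
static int64_t best_sort_time_ns(size_t n, int runs, unsigned int seed)
{
    if (n == 0 || runs <= 0)
        return -1;

    unsigned int *input = malloc(n * sizeof *input);
    unsigned int *work = malloc(n * sizeof *work);
    if (!input || !work) {
        free(input);
        free(work);
        return -1;
    }

    srand(seed); /* fixed seed: every build sorts identical data */
    for (size_t i = 0; i < n; ++i)
        input[i] = (unsigned int)rand();

    int64_t best = -1;
    for (int r = 0; r < runs; ++r) {
        memcpy(work, input, n * sizeof *work); /* undo the previous sort */
        int64_t t0 = now_ns();
        bubble_sort(work, n);
        int64_t elapsed = now_ns() - t0;
        assert(is_sorted(work, n)); /* checked outside the timed region */
        if (best < 0 || elapsed < best)
            best = elapsed;
    }

    free(input);
    free(work);
    return best;
}
```

Calling, for example, best_sort_time_ns(20000, 5, 42) from main and printing the result gives one number per build. The array size and run count here are arbitrary and should be raised until a single run takes far longer than the timer noise. Both builds (with and without -pthread) should also use the same optimization flags, otherwise the comparison measures the flags rather than the code.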