I'm trying to use OpenMP to parallelize the creation of a kind of dictionary, defined as follows:
typedef struct Symbol {
    int usage;
    char character;
} Symbol;
typedef struct SymbolDictionary {
    int charsNr;
    Symbol *symbols;
} SymbolDictionary;
I wrote the following code:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#include <omp.h>
static const int n = 10;
int main(int argc, char* argv[]) {
    int thread_count = strtol(argv[1], NULL, 10);
    omp_set_dynamic(0);
    omp_set_num_threads(thread_count);
    SymbolDictionary **symbolsDict = calloc(omp_get_max_threads(), sizeof(SymbolDictionary*));
    SymbolDictionary *dict = NULL;
    int count = 0;
    #pragma omp parallel for firstprivate(dict, count) shared(symbolsDict)
    for (int i = 0; i < n; i++) {
        if (count == 0) {
            dict = calloc(1, sizeof(SymbolDictionary));
            dict->charsNr = 0;
            dict->symbols = calloc(n, sizeof(Symbol));
            #pragma omp critical
            symbolsDict[omp_get_thread_num()] = dict;
        }
        dict->symbols[count].usage = i;
        dict->symbols[count].character = 'a' + i;
        ++dict->charsNr;
        ++count;
    }
    if (omp_get_max_threads() > 1) {
        // merge the dictionaries
    }
    for (int j = 0; j < symbolsDict[0]->charsNr; j++)
        printf("symbolsDict[0][%d].character: %c\nsymbolsDict[0][%d].usage: %d\n",
               j,
               symbolsDict[0]->symbols[j].character,
               j,
               symbolsDict[0]->symbols[j].usage);
    for (int i = 0; i < omp_get_max_threads(); i++)
        free(symbolsDict[i]->symbols);
    free(symbolsDict);
    return 0;
}
The code compiles and runs, but I'm not sure how the omp block works or whether I implemented it correctly. In particular, I attach dict to symbolsDict at the beginning of the loop, because I don't know when a thread will complete its work. By doing that, however, different threads will probably write to symbolsDict at the same time, although to different elements. Each thread uses a different index, and dict should be different for every thread, but I'm not sure this is a good way to do it.
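Reduced to its essentials, the access pattern I'm worried about looks like the sketch below: each thread writes only the element of a shared array indexed by its own thread number. The names fill_slots and slots are made up for illustration, and the serial stand-ins under #else are my own assumption so that the snippet also builds without OpenMP.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins so the sketch also builds without OpenMP (assumption). */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: every thread writes only slots[its own id].
 * Returns the team size actually used, which may be smaller than capacity. */
int fill_slots(int *slots, int capacity) {
    assert(slots != NULL && capacity > 0);
    int team = 0;
    #pragma omp parallel num_threads(capacity)
    {
        int tid = omp_get_thread_num();
        slots[tid] = tid + 1;            /* distinct element per thread */
        #pragma omp single
        team = omp_get_num_threads();    /* written by exactly one thread */
    }
    return team;
}
```

Calling fill_slots(slots, 4) on a zeroed array of four ints should leave slots[t] == t + 1 for every thread t that actually ran, and leave the remaining elements untouched.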
I tested the code with different numbers of threads and with dictionaries of different sizes. I didn't run into any problems, but maybe that was just luck.
I mostly relied on the theory in the documentation. So I would like to know whether I implemented the code correctly. If not, what is incorrect, and why?
The critical section around symbolsDict, you don't need that. Since every thread accesses a different entry, there is no race condition. Mostly I'm wondering how this parallel section is supposed to speed up your code. Seems to me like that merge code would take more time than just initializing everything at once in a single thread.
... std::map for your dictionary, and do a reduction on that.
... omp_set_dynamic(0); omp_set_num_threads(thread_count); is not enough to guarantee this. You have two alternatives: 1) check the number of threads obtained inside the parallel block using the omp_get_num_threads() function, or 2) check before the parallel block that thread_count <= omp_get_thread_limit().
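Putting the comments together, one possible shape for the code is sketched below. It is not the asker's program, only an illustration under a few assumptions: build_dictionary, parts, requested and team_out are hypothetical names, the per-thread buffers are allocated up front to keep error handling serial, the character is wrapped with i % 26 so it stays in 'a'..'z', and the #else branch provides serial stand-ins so the snippet builds without OpenMP. Each thread fills only its own entry (so no critical section), the team size is read inside the parallel region with omp_get_num_threads(), and the merge runs serially afterwards.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins so the sketch also builds without OpenMP (assumption). */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

typedef struct Symbol { int usage; char character; } Symbol;
typedef struct SymbolDictionary { int charsNr; Symbol *symbols; } SymbolDictionary;

/* Hypothetical helper: fills n symbols using up to `requested` threads and
 * merges the per-thread parts into one dictionary. Stores the team size
 * actually obtained in *team_out. Returns NULL on allocation failure;
 * the caller frees result->symbols and result. */
SymbolDictionary *build_dictionary(int n, int requested, int *team_out) {
    assert(n >= 0 && requested > 0 && team_out != NULL);
    size_t cap = n > 0 ? (size_t)n : 1;
    SymbolDictionary *merged = calloc(1, sizeof *merged);
    SymbolDictionary *parts = calloc((size_t)requested, sizeof *parts);
    int ok = merged != NULL && parts != NULL;
    if (ok) merged->symbols = calloc(cap, sizeof(Symbol));
    ok = ok && merged->symbols != NULL;
    /* One buffer per possible thread, allocated before the parallel region. */
    for (int t = 0; ok && t < requested; t++) {
        parts[t].symbols = calloc(cap, sizeof(Symbol));
        ok = parts[t].symbols != NULL;
    }
    int team = 1;
    if (ok) {
        #pragma omp parallel num_threads(requested)
        {
            /* Private pointer to this thread's own entry: no lock needed. */
            SymbolDictionary *mine = &parts[omp_get_thread_num()];
            #pragma omp single
            team = omp_get_num_threads();   /* may be smaller than requested */
            #pragma omp for schedule(static)
            for (int i = 0; i < n; i++) {
                Symbol *s = &mine->symbols[mine->charsNr++];
                s->usage = i;
                s->character = (char)('a' + i % 26);
            }
        }
        /* Serial merge: entries of threads that never ran contribute nothing. */
        for (int t = 0; t < requested; t++) {
            memcpy(merged->symbols + merged->charsNr, parts[t].symbols,
                   (size_t)parts[t].charsNr * sizeof(Symbol));
            merged->charsNr += parts[t].charsNr;
        }
    }
    for (int t = 0; parts != NULL && t < requested; t++) free(parts[t].symbols);
    free(parts);
    if (!ok) {
        if (merged) free(merged->symbols);
        free(merged);
        return NULL;
    }
    *team_out = team;
    return merged;
}
```

A call such as build_dictionary(10, 4, &team) should return a dictionary with charsNr == 10 in which every usage value from 0 to 9 appears exactly once. Whether this is any faster than a plain serial loop is a separate question, as the first comment points out: for work this small, the allocation and merge overhead will likely dominate.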