
I'm trying to use OpenMP to parallelize the creation of a kind of dictionary, defined as follows.

typedef struct Symbol {
    int usage;
    char character; 
} Symbol;

typedef struct SymbolDictionary {
    int charsNr; 
    Symbol *symbols; 
} SymbolDictionary;

I wrote the following code.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#include <omp.h>

static const int n = 10;

int main(int argc, char* argv[]) {
  int thread_count = strtol(argv[1], NULL, 10);
  omp_set_dynamic(0);
  omp_set_num_threads(thread_count);

  SymbolDictionary **symbolsDict = calloc(omp_get_max_threads(), sizeof(SymbolDictionary*));
  SymbolDictionary *dict = NULL;
  int count = 0; 
  #pragma omp parallel for firstprivate(dict, count) shared(symbolsDict)
  for (int i = 0; i < n; i++) {
    if (count == 0) {
      dict = calloc(1, sizeof(SymbolDictionary));
      dict->charsNr = 0;
      dict->symbols = calloc(n, sizeof(Symbol));

      #pragma omp critical
      symbolsDict[omp_get_thread_num()] = dict;
    }

    dict->symbols[count].usage = i;
    dict->symbols[count].character = 'a' + i;
    ++dict->charsNr;
    ++count;
  }

  if (omp_get_max_threads() > 1) {
    // merge the dictionaries
  }

  for (int j = 0; j < symbolsDict[0]->charsNr; j++)
    printf("symbolsDict[0][%d].character: %c\nsymbolsDict[0][%d].usage: %d\n",
      j,
      symbolsDict[0]->symbols[j].character,
      j,
      symbolsDict[0]->symbols[j].usage);

  for (int i = 0; i < omp_get_max_threads(); i++)
    free(symbolsDict[i]->symbols);
  
  free(symbolsDict);
  return 0;
}

The code compiles and runs, but I'm not sure how the omp block works or whether I implemented it correctly. In particular, I have to attach dict to symbolsDict at the beginning of the loop, because I don't know when a thread will complete its work. By doing that, however, different threads will probably write to symbolsDict at the same time, although to different elements. Even though each thread uses its own element and dict is different for every thread, I'm not sure this is a good way to do it.

I tested the code with different numbers of threads and with dictionaries of different sizes. I didn't run into any problems, but maybe that was just chance.

I only looked up the theory in the documentation, so I would like to know whether I implemented the code correctly. If not, what is incorrect, and why?

  • The question is a bit hard to find in this post. Based on "I'm not sure how the omp block works and if I implemented it correctly" is your question "How does the omp block work, and did I implement it correctly?" Can you please edit to make your question clear? Commented Dec 29, 2022 at 21:33
  • If you are asking about the critical section to protect the symbolsDict, you don't need that. Since every thread accesses a different entry, there is no race condition. Mostly I'm wondering how this parallel section is supposed to speed up your code. Seems to me like that merge code would take more time than just initializing all at once in a single thread. Commented Dec 29, 2022 at 21:35
  • @Homer512, this is an experiment and I would like to compare the two implementations, the serial one and the multi thread. The aim is to learn how to work with OpenMP but also to compare the results. Commented Dec 29, 2022 at 21:43
  • Maybe you should learn C++. You could use a std::map for your dictionary, and do a reduction on that. Commented Dec 29, 2022 at 21:56
  • @Scotty I just wanted to emphasize that if you write code that depends on the number of threads used, you have to make sure that you actually obtain them. Note that using omp_set_dynamic(0); omp_set_num_threads(thread_count); is not enough to guarantee this. You have 2 alternatives: 1) check the number of threads obtained inside the parallel block using the omp_get_num_threads() function, or 2) check before the parallel block that thread_count <= omp_get_thread_limit(). Commented Dec 30, 2022 at 14:26
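
The first alternative from the last comment could be sketched roughly as follows. This is only an illustration, not code from the post: the helper name granted_threads is hypothetical, and the #else branch contains stand-in stubs (an assumption of this sketch) so that it still builds when the compiler is run without OpenMP support.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumption of this sketch: stand-in stubs so it builds without OpenMP. */
static void omp_set_dynamic(int d) { (void)d; }
static void omp_set_num_threads(int n) { (void)n; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: request `requested` threads (must be positive) and
   report how many threads the runtime actually put in the team. */
static int granted_threads(int requested) {
  int granted = 0;
  omp_set_dynamic(0);
  omp_set_num_threads(requested);
  #pragma omp parallel
  {
    /* One thread records the team size; the others wait at the
       implicit barrier that ends the single construct. */
    #pragma omp single
    granted = omp_get_num_threads();
  }
  return granted;
}
```

A caller could compare the return value with thread_count and size per-thread arrays from the granted number, or stop early if it gets fewer threads than the rest of the code assumes.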

1 Answer

"different threads will write inside symbolsDict at the same time but in different memory. Although the threads will use different access points, dict should be different for every thread, I'm not sure this is a good way to do that."

It isn't a good way, but it is safe, because each thread writes only to its own element of symbolsDict. A cleaner way would be this:

SymbolDictionary **symbolsDict = calloc(
      omp_get_max_threads(), sizeof(SymbolDictionary*));

#pragma omp parallel
{
    SymbolDictionary *dict = calloc(1, sizeof(SymbolDictionary));
    int count = 0;
    dict->charsNr = 0;
    dict->symbols = calloc(n, sizeof(Symbol));
    symbolsDict[omp_get_thread_num()] = dict;
#   pragma omp for nowait
    for(int i = 0; i < n; i++) {
        dict->symbols[count].usage = i;
        dict->symbols[count].character = 'a' + i;
        ++dict->charsNr;
        ++count;
    }
}

Note that the inner pragma is omp for, not omp parallel for, so it uses the enclosing parallel region to distribute its work. The nowait clause is a small performance improvement: it removes the barrier at the end of the loop, which is redundant here because the loop is the last statement in the parallel region and all threads synchronize at the implicit barrier that ends the region anyway.
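
The question leaves the merge step as a placeholder. One possible way to do it, shown only as a sketch, is to concatenate the per-thread dictionaries serially after the parallel region. The helpers merge_dicts and free_dict are hypothetical names, not part of the original code. They skip NULL slots, which can occur if the runtime grants fewer threads than omp_get_max_threads() reported, and free_dict releases both the symbols array and the dictionary struct itself.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Symbol {
  int usage;
  char character;
} Symbol;

typedef struct SymbolDictionary {
  int charsNr;
  Symbol *symbols;
} SymbolDictionary;

/* Hypothetical helper: concatenate the per-thread dictionaries into a new
   one, in slot order. NULL slots are skipped. Returns NULL if an
   allocation fails. */
static SymbolDictionary *merge_dicts(SymbolDictionary **parts, int nparts) {
  int total = 0;
  for (int t = 0; t < nparts; t++)
    if (parts[t] != NULL)
      total += parts[t]->charsNr;

  SymbolDictionary *merged = calloc(1, sizeof *merged);
  if (merged == NULL)
    return NULL;
  /* Allocate at least one element so an empty merge is still valid. */
  merged->symbols = calloc(total > 0 ? (size_t)total : 1, sizeof(Symbol));
  if (merged->symbols == NULL) {
    free(merged);
    return NULL;
  }

  for (int t = 0; t < nparts; t++) {
    if (parts[t] == NULL || parts[t]->charsNr == 0)
      continue;
    memcpy(merged->symbols + merged->charsNr, parts[t]->symbols,
           (size_t)parts[t]->charsNr * sizeof(Symbol));
    merged->charsNr += parts[t]->charsNr;
  }
  return merged;
}

/* Hypothetical helper: free a dictionary and its symbols; NULL is allowed. */
static void free_dict(SymbolDictionary *d) {
  if (d == NULL)
    return;
  free(d->symbols);
  free(d);
}
```

After the parallel region this could be called as merge_dicts(symbolsDict, omp_get_max_threads()), followed by free_dict on each slot and on the merged result once it is no longer needed.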
