As I understand the code presented, the general objective is essentially to generate all the N-letter words that can be formed with a length-letter alphabet. Each recursive call to gen() corresponds to one letter position, and so each time control reaches the bottom of the recursion, the first N elements of arrayofindex represent the letters of one word.
But it should be obvious that multiple threads running in parallel cannot use the same arrayofindex. Each thread expects when it reaches the bottom of the recursion to find in arrayofindex the values that it set along its recursive path. That's fundamental to the approach. If other threads are modifying arrayofindex at the same time then they all likely get mishmashes of values set by different threads. Moreover, you probably don't get anything like the speedup you hope for, because the threads need to synchronize their accesses to arrayofindex.
Note: This problem has nothing in particular to do with recursion. You would have exactly the same issue if you modified the code to be iterative instead of recursive -- which I, myself, would in fact do if I were looking to improve performance, though I do not demonstrate that here.
There are various ways to give each OMP thread its own working array. If the space must continue to be dynamically allocated, then you should arrange for it to be allocated inside the parallel region, so that each thread allocates its own. If you're willing and able to rely on variable-length arrays for this, however, then probably the only other thing you need is an OMP private clause attached to your parallel for construct.
For example, this variation on your code works for me:
void gen_tail(int length, int num_letters, int arrayofindex[], int position) {
for (int letter = 0; letter < num_letters; letter++) {
arrayofindex[position] = letter;
if (position + 1 < length) {
gen_tail(length, num_letters, arrayofindex, position + 1);
} else {
// this is the bottom ... do something with arrayofindex, such as:
#pragma omp critical
{
for (int i = 0; i < length; i++) {
putchar('A' + arrayofindex[i]);
}
putchar('\n');
}
}
}
}
void gen(int length, int num_letters) {
assert(length > 1);
int arrayofindex[length]; // Note: VLA, _not_ dynamically allocated
// Marking the array 'private' means each thread gets its own copy.
// This would not have the needed effect if 'arrayofindex' were a pointer.
#pragma omp parallel for private(arrayofindex)
for (int letter = 0; letter < num_letters; letter++) {
arrayofindex[0] = letter;
gen_tail(length, num_letters, arrayofindex, 1);
}
}
int main(void) {
gen(5, 4);
}
That emits the expected 1024 (== 45) results for me, all distinct, as I have every reason to expect it should do.
for(arrayofindex[index]; ...supposed to accomplish for you? Is that a typo?criticalfunc()do? In particular, does it access the elements ofarrayofindex?arrayofindex = (long*) malloc(5);is almost surely wrong. Did you meanarrayofindex = malloc(5 * sizeof(long));?=instead of==inif(index = 0){, I'm done. As presented, this code has an infinite recursion. Update the code to present a bona fide minimal reproducible example, or we are unlikely to be able to help you.gen()with different arguments. OpenMP has no magic solution for that. But we can neither confirm nor suggest any mitigation without an MCVE to work with. You have not yet presented one, for there is no definition ofcriticalfunc()(which indeed seems to be critical), nor any example input, expected output, or actual (erroneous) output.