
I'm looking for a way to call a function in parallel.

For example, if I have 4 threads, I want each of them to call the same function with its own thread id as an argument.

Because of the argument, no two threads will work on the same data.

#pragma omp parallel
{
    for(int p = 0; p < numberOfThreads; p++)
    {
        if(p == omp_get_thread_num())
            parDF(p);
    }
}

Thread 0 should run parDF(0)

Thread 1 should run parDF(1)

Thread 2 should run parDF(2)

Thread 3 should run parDF(3)

All this should be done at the same time...

This (obviously) doesn't work, but what is the right way to do parallel function calls?

EDIT: The actual code (This might be too much information... But it was asked for...)

From the function that calls parDF():

omp_set_num_threads(NUM_THREADS);
#pragma omp parallel
{

    numberOfThreads = omp_get_num_threads();
    //split nodeQueue
    #pragma omp master
    {
        splitNodeQueue(numberOfThreads);
    }
    int tid = omp_get_thread_num();

    //printf("Hello World from thread = %d\n", tid);
    #pragma omp parallel for private(tid)
    for(int i = 0; i < numberOfThreads; ++i)
    {
            parDF(tid, originalQueueSize, DFlevel);
    }
}

The parDF function:

bool Tree::parDF(int id, int originalQueueSize, int DFlevel)
{
        double possibilities[20];
        double sequence[3];
        double workingSequence[3];
        int nodesToExpand = originalQueueSize/omp_get_num_threads();
        int tenthsTicks = nodesToExpand/10;
        int numPossibilities = 0;
        int percentage = 0;
        list<double>::iterator i;
        list<TreeNode*>::iterator n;

        cout << "My ID is: " << omp_get_thread_num() << endl;

        while(parNodeQueue[id].size() > 0 and parNodeQueue[id].back()->depth == DFlevel)
        {

            if(parNodeQueue[id].size()%tenthsTicks == 0)
            {
                cout << endl;
                cout << percentage*10 << "% done..." << endl;
                if(percentage == 10)
                {
                    percentage = 0;
                }
                percentage++;
            }

            //countStartPoints++;
            depthFirstQueue.push_back(parNodeQueue[id].back());
            numPossibilities = 0;

            for(i = parNodeQueue[id].back()->content.sortedPoints.begin(); i != parNodeQueue[id].back()->content.sortedPoints.end(); i++)
            {

                for(int j = 0; j < deltas; j++)
                {
                    if(parNodeQueue[id].back()->content.doesPointExist((*i) + delta[j]))
                    {
                        for(int k = 0; k <= numPossibilities; k++)
                        {
                            if(fabs((*i) + delta[j] - possibilities[k]) < 0.01)
                            {
                                goto pointAlreadyAdded;
                            }
                        }
                        possibilities[numPossibilities] = ((*i) + delta[j]);
                        numPossibilities++;
                        pointAlreadyAdded:
                        continue;
                    }
                }
            }

            // Out of the list of possible points. All combinations of 3 are added, building small subtrees in from the node.
            // If a subtree succesfully breaks the lower bound, true is returned.

            for(int i = 0; i < numPossibilities; i++)
            {
                for(int j = 0; j < numPossibilities; j++)
                {
                    for(int k = 0; k < numPossibilities; k++)
                    {
                        if( k != j and j != i and i != k)
                        {
                            sequence[0] = possibilities[i];
                            sequence[1] = possibilities[j];
                            sequence[2] = possibilities[k];
                            //countSeq++;
                            if(addSequence(sequence, id))
                            {
                                //successes++;
                                workingSequence[0] = sequence[0];
                                workingSequence[1] = sequence[1];
                                workingSequence[2] = sequence[2];
                                parNodeQueue[id].back()->workingSequence[0] = sequence[0];
                                parNodeQueue[id].back()->workingSequence[1] = sequence[1];
                                parNodeQueue[id].back()->workingSequence[2] = sequence[2];
                                parNodeQueue[id].back()->live = false;
                                succesfulNodes.push_back(parNodeQueue[id].back());
                                goto nextNode;
                            }
                            else
                            {
                                destroySubtree(parNodeQueue[id].back());
                            }
                        }
                    }
                }
            }
            nextNode:
            parNodeQueue[id].pop_back();
        }
}
  • Do not forget to compile and link with OpenMP: -fopenmp with gcc. Commented Jan 12, 2015 at 14:34
  • I wouldn't use the line if(p == omp_get_thread_num()), as OpenMP will automatically assign the available threads to the work inside the loop. You shouldn't care about which thread is computing your data: what if you have only 2 threads available? You will never get true for p == omp_get_thread_num() with p = 2 or 3, so your loop will be executed four times by threads 0 and 1, and parDF(2) and parDF(3) will never be called. Commented Jan 12, 2015 at 15:10

3 Answers


Is this what you are after?

Live On Coliru

#include <omp.h>
#include <cstdio>

int main()
{

    int nthreads, tid;

#pragma omp parallel private(tid)
    {

        tid = ::omp_get_thread_num();
        printf("Hello World from thread = %d\n", tid);

        /* Only master thread does this */
        if (tid == 0) {
            nthreads = ::omp_get_num_threads();
            printf("Number of threads = %d\n", nthreads);
        }

    } /* All threads join master thread and terminate */
}

Output:

Hello World from thread = 0
Number of threads = 8
Hello World from thread = 4
Hello World from thread = 3
Hello World from thread = 5
Hello World from thread = 2
Hello World from thread = 1
Hello World from thread = 6
Hello World from thread = 7
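
The output above comes from a machine reporting 8 threads. If exactly four calls are wanted, as in the question, the team size can be pinned with the num_threads clause. A minimal sketch, where parDF is a made-up stand-in for the question's function that only prints its argument:

#include <omp.h>
#include <cstdio>

// Hypothetical stand-in for the question's parDF(): it only prints its id.
static void parDF(int id)
{
    printf("parDF called with id = %d\n", id);
}

int main()
{
    // num_threads(4) pins the team size, so exactly four calls are made,
    // one per thread, regardless of how many cores the machine has.
#pragma omp parallel num_threads(4)
    {
        parDF(omp_get_thread_num());
    }
}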

8 Comments

Shouldn't you add num_threads(4) at the end of the #pragma? Like this: #pragma omp parallel private(tid) num_threads(4), then you have 4 threads executing your code.
@dkg your guess is as good as anyone's. Why would I? The OP is also using all available threads, as far as I can tell.
@sehe: OpenMP sets the number of threads dynamically. If the number chosen by OpenMP is lower than the number of times you want your function to be executed, then you won't get your desired results. Otherwise you should place the call in a loop iterating the desired number of times.
@dkg I know all this. I'm assuming the OP also knows, and he was just asking about how to achieve... well, what he asks: "For example, if I have 4 threads, I want each of them to call the same function with its own thread id as an argument." I think you're somehow reading a different question. (Did you notice you can replace printf with parDF, e.g.?)
I put in some prints at the beginning of the parDF function. It looks like the function is called several times by each thread, but without running all the code in the function. "My ID is: X" is printed a couple of times for each thread in random order, and then it looks like it runs the function 4 consecutive times... There are a few loops in the parDF function, but since they are called from a #pragma omp parallel block, they should be run individually by each thread, right?

You should be doing something like this :

int tid;
#pragma omp parallel private(tid)
{
    tid = omp_get_thread_num();
    parDF(tid);
}

I think it's quite straightforward.
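
A self-contained version of this pattern, with a stub parDF added purely for illustration, might look like the sketch below; compile and link with OpenMP as noted in the comment on the question (e.g. g++ -fopenmp).

#include <omp.h>
#include <cstdio>

// Stub standing in for the real parDF(); assumed here to take only the id.
void parDF(int id)
{
    printf("parDF(%d) running on thread %d\n", id, omp_get_thread_num());
}

int main()
{
    int tid;
#pragma omp parallel private(tid)
    {
        tid = omp_get_thread_num();   // each thread gets its own id
        parDF(tid);                   // one call per thread, all running concurrently
    }
}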

3 Comments

@user3162941 You should try it before you claim this. Unless the number of threads is already configured to be 1, you are just wrong. My answer shows exactly the same and it's live on Coliru, so you can even check this from your lounge chair.
The function is called, but the entire code is not executed. I print each thread's ID at the beginning of the function, but the actual loops, where it does something, are run only once... Do I need a #pragma omp parallel block inside the called function to get it to run in parallel, or is it enough that the function is called from a parallel block?
Can you post the code of the function? It is possible that you are doing something that creates a bottleneck for parallel execution, or that later instructions are modifying the same resources in a similar way.

There are two ways to achieve what you want:

  1. Exactly the way you are describing it: each thread starts the function with its own thread id:

    #pragma omp parallel
    {
        int threadId = omp_get_thread_num();
        parDF(threadId);
    }
    

    The parallel block starts as many threads as the system reports it supports, and each of them executes the block. Since they differ in threadId, they will process different data. To force the starting of more threads you can add a num_threads(100) clause, or whatever number you need, to the pragma.

  2. The correct way to do what you want is to use a parallel for block.

    #pragma omp parallel for
    for (int i=0; i < numThreads; ++i) {
        parDF(i);
    }
    

    This way each iteration of the loop (each value of i) gets assigned to a thread that executes it. As many iterations will run in parallel as there are available threads.

Method 1 is not very general and is inefficient, because you need exactly as many threads as there are function calls you want to make. Method 2 is the canonical (right) way to get your problem solved.
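
To illustrate why method 2 is more general: the number of calls does not have to match the number of threads. A minimal sketch, where numberOfTasks and the parDF stub are made up for illustration:

#include <omp.h>
#include <cstdio>

// Illustrative stub; the real parDF() would do the per-task work.
void parDF(int taskId)
{
    printf("task %d handled by thread %d\n", taskId, omp_get_thread_num());
}

int main()
{
    const int numberOfTasks = 16;   // more tasks than threads is fine

    // OpenMP splits the 16 iterations among the available threads;
    // each iteration runs exactly once, on whichever thread picks it up.
#pragma omp parallel for
    for (int i = 0; i < numberOfTasks; ++i)
    {
        parDF(i);
    }
}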

2 Comments

That was my thought exactly, but this executes in a sequential way. With 2 threads, it first runs parDF(0) and then, when that is done, it runs parDF(1)...
This sounds like you have a problem with your runtime environment. Are you sure OpenMP is even working for you? What machine/system are you using, and how are you compiling the program?
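
A note on the symptoms reported in the comments above (each thread printing "My ID is: X" several times, and the calls appearing to run one after another): this is consistent with the code in the question's EDIT, where a #pragma omp parallel for sits inside an already parallel region. With nested parallelism disabled, which is the default, every outer thread then executes the whole inner loop itself. A restructured sketch of that calling code, keeping the names from the question, could look like this; the barrier after the master block is an added assumption, so that no thread uses parNodeQueue before splitNodeQueue has finished:

omp_set_num_threads(NUM_THREADS);
#pragma omp parallel
{
    int numberOfThreads = omp_get_num_threads();

    // Only the master thread splits the queue...
    #pragma omp master
    {
        splitNodeQueue(numberOfThreads);
    }
    // ...and everyone waits here until the split is done
    // (omp master has no implied barrier).
    #pragma omp barrier

    // One call per thread, each with its own id; no inner "parallel for" is needed.
    int tid = omp_get_thread_num();
    parDF(tid, originalQueueSize, DFlevel);
}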
