Parallel sections code with nested loops in openmp

Question

I wrote this parallel code so that the iterations are shared out as first and last, first+1 and last-1, and so on. But I don't know how to improve the code in each of the two parallel sections: each section has an inner loop and I can't think of any way to simplify it. Thanks.

This isn't about which values are stored in x or y. I use this sections design because the requirement is to execute the iterations from 0 to N in the order 0 N, 1 N-1, 2 N-2, but I would like to know if I can optimize the inner loops while keeping this model.

int x = 0, y = 0,k,i,j,h;
#pragma omp parallel private(i, h) reduction(+:x, y)
    {
            #pragma omp sections
            {
                    #pragma omp section
                    {
                            for (i=0; i<N/2; i++)
                            {
                                    C[i] = 0;
                                    for (j=0; j<N; j++)
                                    {
                                        C[i] += MAT[i][j] * B[j];
                                    }
                                    x += C[i];
                            }
                    }
                    #pragma omp section
                    {
                            for (h=N-1; h>=N/2; h--) 
                            {
                                    C[h] = 0;
                                    for (k=0; k<N; k++)
                                    {
                                        C[h] += MAT[h][k] * B[k];
                                    }
                                    y += C[h];
                            }
                    }
            }
    }
    x = x + y;

Homer512 · Accepted Answer · 2021-12-24 10:37:16Z

2

Using sections seems like the wrong approach. A pragma omp for seems more appropriate. Also note that you forgot to declare j private.

int x = 0, y = 0,k,i,j;
#pragma omp parallel private(i,j) reduction(+:x, y)
{
#   pragma omp for nowait
    for(i=0; i<N/2; i++) {
        // local variable to make the life easier on the compiler
        int ci = 0;
        for(j=0; j<N; j++)
            ci += MAT[i][j] * B[j];
        x += ci;
        C[i] = ci; 
    }
#   pragma omp for nowait
    for(i=N/2; i < N; i++) {
        int ci = 0;
        for(j=0; j<N; j++)
            ci += MAT[i][j] * B[j];
        y += ci;
        C[i] = ci;
    }
}
x = x + y;

Also, I'm not sure but if you just want x as your final output, you can simplify the code even further:

int x=0, i, j;
#pragma omp parallel for reduction(+:x) private(i,j)
for(i=0; i < N; ++i)
    for(j=0; j < N; ++j)
        x += MAT[i][j] * B[j];


4 Comments

JamesR Over a year ago

The problem is that running the iterations as first and last, first+1 and last-1, ... is a requirement, even though there are better alternatives. That's why I use the sections without nowait.

Homer512 Over a year ago

@JamesR But why? Is your actual algorithm more complicated?

Victor Eijkhout Over a year ago

Instead of declaring i,j private, I'd declare them in the loop header: for (int i=whatever).

Homer512 Over a year ago

@VictorEijkhout My C knowledge is a bit rusty but AFAIR that wouldn't be valid C, right? And the post is tagged with C, not C++
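For reference, declaring the loop counter in the loop header has been valid C since C99 (it is not valid in C89/C90), and variables declared inside the parallel loop are private to each thread without a private clause. A minimal sketch of that suggestion, assuming a hypothetical helper `matvec_sum` and a flattened row-major matrix, neither of which appears in the original post:

```c
#include <stddef.h>

/* Hypothetical helper (not from the original post): computes c = mat * b
 * for an n x n row-major matrix and returns the sum of all entries of c.
 * Needs C99 or later because i and j are declared in the loop headers. */
int matvec_sum(int n, const int *mat, const int *b, int *c)
{
    int x = 0;
    /* i is the parallel loop variable, so OpenMP makes it private.
     * j and ci are declared inside the loop body, so each thread has its own. */
    #pragma omp parallel for reduction(+:x)
    for (int i = 0; i < n; i++) {
        int ci = 0;
        for (int j = 0; j < n; j++)
            ci += mat[(size_t)i * n + j] * b[j];
        c[i] = ci;
        x += ci;
    }
    return x;
}
```

If the pragma is ignored (compiling without OpenMP support), the function still runs serially with the same result.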

ZAWK · Accepted Answer · 2021-12-29 18:45:35Z

1

The sections construct distributes different tasks to different threads, and each section block marks a different task, so you will not be able to run the iterations in the order you want. I answered you here:

Distribution of loop iterations between threads with a specific order

But I want to clarify that the requirement to use sections is that each block must be independent of the other blocks.
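As an illustration of independent section blocks, here is a small sketch in which each section writes to its own variable and only reads the shared input. The function name `sum_and_max` and its parameters are made up for this example and are not from the question:

```c
/* Hypothetical example (not from the question): two independent tasks,
 * one per section. Assumes n >= 1. */
void sum_and_max(const int *a, int n, int *sum_out, int *max_out)
{
    int sum = 0;
    int max = a[0];
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            /* Task 1: the only block that writes to sum. */
            for (int i = 0; i < n; i++)
                sum += a[i];
        }
        #pragma omp section
        {
            /* Task 2: the only block that writes to max. */
            for (int i = 0; i < n; i++)
                if (a[i] > max)
                    max = a[i];
        }
    }
    /* The implicit barrier at the end of the region makes both results visible. */
    *sum_out = sum;
    *max_out = max;
}
```

Because neither block reads anything the other writes, the result does not depend on which thread runs which section, or on whether they overlap in time.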


Victor Eijkhout · Accepted Answer · 2021-12-24 14:38:09Z

0

A section gets only one thread, so you can't make the loops parallel. How about

Make a parallel loop to N at the top level,
then inside each iteration use a conditional to decide whether to accumulate into x,y?

Although @Homer512 's solution looks correct to me too.
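The conditional idea could look roughly like the sketch below. It assumes a flattened row-major matrix and a hypothetical function name `matvec_split_sum`; the split at n/2 mirrors the two sections in the question:

```c
#include <stddef.h>

/* Hypothetical helper (not from the question): one parallel loop over all
 * rows, with a conditional choosing which accumulator receives each row. */
int matvec_split_sum(int n, const int *mat, const int *b, int *c,
                     int *x_out, int *y_out)
{
    int x = 0, y = 0;
    #pragma omp parallel for reduction(+:x, y)
    for (int i = 0; i < n; i++) {
        int ci = 0;
        for (int j = 0; j < n; j++)
            ci += mat[(size_t)i * n + j] * b[j];
        c[i] = ci;
        if (i < n / 2)
            x += ci;   /* rows handled by the first section in the question */
        else
            y += ci;   /* rows handled by the second section */
    }
    *x_out = x;
    *y_out = y;
    return x + y;
}
```

Note that this gives the same x, y and C as the sections version, but it does not preserve the 0, N-1, 1, N-2 execution order.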


5 Comments

JamesR Over a year ago

This isn't about which values are stored in x or y. I use this sections design because the requirement is to execute the iterations from 0 to N in the order 0 N, 1 N-1, 2 N-2, but I would like to know if I can optimize the inner loops while keeping this model.

Victor Eijkhout Over a year ago

Why do iterations need to be executed like that? I see nothing in the code that requires it.

JamesR Over a year ago

Nothing special, it's a request from my teacher

Victor Eijkhout Over a year ago

It doesn't make sense to insist on a sequential ordering in a parallel program. Anyway, if it really needs to be done in that sequence, then you need to make sure you limit it to two threads, and then your original code is a/the correct solution. But it seems like a pointless exercise to me.

JamesR Over a year ago

I think it's only to make sure we learn how sections work.
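A sketch of the two-thread sections variant discussed in these comments, with every loop variable declared locally so that none is accidentally shared. The function name `matvec_two_sections` and the flattened row-major matrix are assumptions made for this example. Also note that OpenMP leaves the assignment of sections to threads up to the implementation, so even with two threads the pairing 0 with N-1, 1 with N-2 is likely but not guaranteed:

```c
#include <stddef.h>

/* Hypothetical helper (not from the question): one section walks the first
 * half of the rows upwards, the other walks the second half downwards. */
int matvec_two_sections(int n, const int *mat, const int *b, int *c)
{
    int x = 0, y = 0;
    #pragma omp parallel num_threads(2) reduction(+:x, y)
    {
        #pragma omp sections
        {
            #pragma omp section
            {
                for (int i = 0; i < n / 2; i++) {        /* 0, 1, 2, ... */
                    int ci = 0;
                    for (int j = 0; j < n; j++)
                        ci += mat[(size_t)i * n + j] * b[j];
                    c[i] = ci;
                    x += ci;
                }
            }
            #pragma omp section
            {
                for (int h = n - 1; h >= n / 2; h--) {   /* N-1, N-2, ... */
                    int ch = 0;
                    for (int k = 0; k < n; k++)
                        ch += mat[(size_t)h * n + k] * b[k];
                    c[h] = ch;
                    y += ch;
                }
            }
        }
    }
    return x + y;
}
```

Accumulating into the locals ci and ch rather than directly into c[i] and c[h] follows the suggestion in the first answer.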
