I am trying to parallelize the nested for loop below using allgather:

    for (int i = 0; i < N1; i++) {
        for (int j = 0; j < N0; j++)
            HS_1[i] += IN[j] * W0[j][i];
    }

Here N1 is 1000 and N0 is 764.

I have four processes and I just want to parallelize the outer loop. Is there a way to do it?

1 Answer

This looks like a matrix-vector multiplication. Let's assume that you've distributed the HS_1 output vector. Each component needs the full IN vector, so you do indeed need an allgather for that. You also need to distribute the W0 matrix: each process gets the entries for its share of the i indices and for all of the j indices.
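
As a rough illustration of that layout, the sketch below has each rank compute only its own block of HS_1 from the full IN vector. The helper names `block_range` and `local_matvec`, and the flattened row-major storage of W0, are assumptions made for this example and do not come from the question. MPI calls are left out so that the indexing can be checked on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: half-open range [*start, *end) of the i indices
   owned by `rank` when n items are split over nprocs ranks.
   The first n % nprocs ranks each take one extra item. */
void block_range(int n, int nprocs, int rank, int *start, int *end)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    int base = n / nprocs;
    int extra = n % nprocs;
    *start = rank * base + (rank < extra ? rank : extra);
    *end = *start + base + (rank < extra ? 1 : 0);
}

/* Compute one rank's slice of the product:
   out[i - start] = sum over j of in[j] * W0[j][i], for start <= i < end.
   Assumed layout: w holds W0 row-major, n0 rows of n1 columns. */
void local_matvec(const double *in, const double *w, int n0, int n1,
                  int start, int end, double *out)
{
    for (int i = start; i < end; i++) {
        double sum = 0.0; /* accumulate from an explicit zero */
        for (int j = 0; j < n0; j++)
            sum += in[j] * w[(size_t)j * n1 + i];
        out[i - start] = sum;
    }
}
```

With N1 = 1000 and four processes, `block_range` gives each rank 250 consecutive i indices; the inner loop over j still runs over all N0 entries of IN.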

3 Comments

IN and W0 are available to all the other processes. In that case, will the following code work?

    partition = N1 / num_proc;
    start = rank * partition;
    end = start + partition;
    double hs_1[partition];
    for (int i = start; i < end; i++) {
        for (int j = 0; j < N0; j++)
            hs_1[i - start] += IN[j] * W0[j][i];
    }
    MPI_Allgatherv(hs_1, N1/rank, MPI_DOUBLE, HS_1, N1/rank, MPI_DOUBLE, MPI_COMM_WORLD);

That code looks correct, but it's pointless. You have the whole matrix on each process, but each process only uses a small part of it. And why do you gather the result at all? A good MPI application works completely distributed.
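
If the full HS_1 is still wanted on every rank, a few details of the gather step are worth checking. `MPI_Allgatherv` takes per-rank arrays of receive counts and displacements rather than a single count, a count such as N1/rank divides by zero on rank 0, and a local buffer that is accumulated with `+=` should be zeroed first. The sketch below builds the two arrays; `fill_counts_displs` is a hypothetical helper, and the MPI call is shown only as a comment so the arithmetic compiles without an MPI installation.

```c
#include <assert.h>

/* Hypothetical helper: counts[r] is the number of elements rank r
   contributes, displs[r] is where rank r's block starts in the
   gathered array. Handles n not divisible by nprocs. */
void fill_counts_displs(int n, int nprocs, int *counts, int *displs)
{
    assert(nprocs > 0);
    int base = n / nprocs;
    int extra = n % nprocs;
    int offset = 0;
    for (int r = 0; r < nprocs; r++) {
        counts[r] = base + (r < extra ? 1 : 0);
        displs[r] = offset;
        offset += counts[r];
    }
}

/* Intended use inside an MPI program (not compiled here):
 *
 *   MPI_Allgatherv(hs_1, counts[rank], MPI_DOUBLE,
 *                  HS_1, counts, displs, MPI_DOUBLE, MPI_COMM_WORLD);
 */
```

When N1 divides evenly by the number of processes, as with 1000 and 4, plain `MPI_Allgather` with a count of N1/num_proc on both the send and receive side would also do.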
