I was trying to parallelize NumPy's dot product using mpi4py on a cluster. The basic idea is to split the first matrix into smaller row blocks, multiply each block by the second matrix, and stack the results back into one matrix.
I am facing an issue though: the result of the parallel multiplication differs from the one computed on a single process, except for the first row.
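The decomposition itself is sound; as a sanity check, here is a plain-NumPy sketch (no MPI, block count of 4 chosen arbitrarily) showing that stacking the per-block products reproduces the full product:

```python
import numpy as np

a = np.random.randint(10, size=(10, 10))
b = np.random.randint(10, size=(10, 10))

# Split a into 4 row blocks, multiply each block by b, stack back together.
blocks = np.array_split(a, 4, axis=0)
stacked = np.vstack([np.dot(block, b) for block in blocks])

print(np.array_equal(stacked, np.dot(a, b)))  # True: block-row product matches
```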
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
world = comm.size
rank = comm.Get_rank()
name = MPI.Get_processor_name()

a = np.random.randint(10, size=(10, 10))
b = np.random.randint(10, size=(10, 10))
c = np.dot(a, b)

# Parallel multiplication
if world == 1:
    result = np.dot(a, b)
else:
    if rank == 0:
        a_row = a.shape[0]
        if a_row >= world:
            split = np.array_split(a, world, axis=0)
    else:
        split = None
    split = comm.scatter(split, root=0)
    split = np.dot(split, b)
    data = comm.gather(split, root=0)
    if rank == 0:
        result = np.vstack(data)

# Compare matrices
if rank == 0:
    print("{} - {}".format(result.shape, c.shape))
    if np.array_equal(result, c):
        print("Multiplication was successful")
    else:
        print("Multiplication was unsuccessful")
        print(result - c)
I have tried executing the split, scatter, gather, and vstack steps without the dot product, and the gathered, stacked matrix was exactly matrix A. That probably means the gathered pieces aren't getting shuffled between processes. Since I don't think np.dot itself can compute the product incorrectly, I suspect the issue is in my algorithm. What am I missing here?

The problem is b being different on each process, because it is generated randomly on each of them. The scatter distributes rank 0's a, but every rank then multiplies its chunk by its own local b; only rank 0's chunk (the first rows) uses the same b as the serial reference c, which is why only the first row matches.
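The usual fix is either to broadcast rank 0's matrices before the scatter (e.g. `b = comm.bcast(b, root=0)`) or to seed every rank's generator identically. A minimal pure-NumPy sketch (no MPI needed, seed value 0 chosen arbitrarily) of why an identical seed makes the ranks agree:

```python
import numpy as np

# Each MPI rank runs the same script, so an unseeded np.random.randint
# gives every rank a different b. Two independently seeded generators
# stand in for two ranks here: with the same seed they produce the same b.
rank0_rng = np.random.RandomState(0)  # stand-in for rank 0
rank1_rng = np.random.RandomState(0)  # stand-in for rank 1

b_rank0 = rank0_rng.randint(10, size=(10, 10))
b_rank1 = rank1_rng.randint(10, size=(10, 10))

print(np.array_equal(b_rank0, b_rank1))  # True: both "ranks" see the same b
```

Broadcasting is the more robust choice, since it keeps working even if some library call consumes random numbers on only some of the ranks.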