
I have an assignment to do matrix multiplication with forks, using shared memory, and then compare the timing against the multiplication without forks. Here is the multiplication without them:

int matrizA[Am][An];
int matrizB[An][Bp];

// here I fill the matrices from a .txt file

int matrizR[Am][Bp];
int a,b,c;
for (a=0; a < Am; a++){
    for (b = 0; b < Bp; b++)
    {
        matrizR[a][b] = 0;
        for (c=0; c<An; c++){
            matrizR[a][b] += matrizA[a][c] * matrizB[c][b]; 
        }
    }
}

Then I tried to implement it with forks, but the results are wrong. I'm not sure whether I have to use shared memory, and where. Should matrizA, matrizB, and matrizR2 be shared? How do I do this?

int matrizR2[Am][Bp];
pid_t pids[Am][Bp];
int h,j;

/* Start children. */
for (h = 0; h < Am; ++h) {
    for (j=0; j<Bp ; ++j){
      if ((pids[h][j] = fork()) < 0) {
        perror("fork");
        abort();
      } else if (pids[h][j] == 0) {
        matrizR2[h][j] = 0;
        for (c=0; c<An; c++){
            matrizR2[h][j] += matrizA[h][c] * matrizB[c][j]; 
        }
        printf("im fork %d,%d\n",h,j);
        exit(0);
      }
    }
}
/* Wait for children to exit. */
int status;
pid_t pid;
int n = Am * Bp;  /* one child was forked per result cell */
while (n > 0) {
  pid = wait(&status);
  --n;
}
  • When you fork, you actually create a new process, and the memory of any process is separate from that of every other process. So yes, you need shared memory, or you could use threads instead. Commented Sep 30, 2015 at 5:42
  • See stackoverflow.com/questions/13274786/… Commented Sep 30, 2015 at 5:48
  • How do I implement shared memory in this example? I read this, but truly didn't understand most of it: link Commented Sep 30, 2015 at 5:51
  • @Stas So, as in that example, should glob_var in my case be all the matrices? How do I initialize them? Commented Sep 30, 2015 at 5:54
  • Is it a requirement that you must use processes and shared memory? Otherwise, threads are much easier. Commented Sep 30, 2015 at 5:58

1 Answer

Not giving a complete solution because it’s a homework assignment, but the functions you use to get shared memory on Unix are documented here: http://pubs.opengroup.org/onlinepubs/009695399/functions/shmget.html

You would call shmget() and then pass the identifier it returns to shmat(), in each child process, to get a pointer to the shared memory.
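To make the shmget()/shmat() sequence concrete, here is a minimal sketch rather than a full solution. It assumes fixed, hypothetical sizes (AM, AN, BP) instead of the sizes read from the .txt file, forks one child per row rather than one per cell, and only places the result matrix in shared memory; the inputs are only read, so each child's private copy of them is enough. The function name multiply_shm is made up for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical fixed sizes; the question reads the real ones from a file. */
enum { AM = 2, AN = 3, BP = 2 };

/* Multiply a (AM x AN) by b (AN x BP) into r, one child process per row.
 * Returns 0 on success, -1 on failure. */
int multiply_shm(int a[AM][AN], int b[AN][BP], int r[AM][BP])
{
    /* One private segment large enough for the whole result matrix. */
    int shmid = shmget(IPC_PRIVATE, sizeof(int[AM][BP]), IPC_CREAT | 0600);
    if (shmid < 0) { perror("shmget"); return -1; }

    int started = 0, failed = 0;
    for (int h = 0; h < AM; ++h) {
        pid_t pid = fork();
        if (pid < 0) { perror("fork"); failed = 1; break; }
        if (pid == 0) {
            /* Child: attach the segment and fill row h only. */
            int (*res)[BP] = shmat(shmid, NULL, 0);
            if (res == (void *)-1) _exit(1);
            for (int j = 0; j < BP; ++j) {
                int sum = 0;
                for (int c = 0; c < AN; ++c)
                    sum += a[h][c] * b[c][j];
                res[h][j] = sum;
            }
            shmdt(res);
            _exit(0);  /* _exit avoids flushing the parent's stdio buffers again */
        }
        ++started;
    }

    /* Parent: wait for every child that was actually started. */
    for (int i = 0; i < started; ++i) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }

    /* Attach in the parent, copy the result out, then release the segment. */
    int (*res)[BP] = shmat(shmid, NULL, 0);
    if (res == (void *)-1) {
        failed = 1;
    } else {
        memcpy(r, res, sizeof(int[AM][BP]));
        shmdt(res);
    }
    shmctl(shmid, IPC_RMID, NULL);
    return failed ? -1 : 0;
}
```

Calling multiply_shm(matrizA, matrizB, matrizR2) would fill matrizR2 in the parent. The segment is removed with shmctl(IPC_RMID) at the end; otherwise it would outlive the program.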

One alternative is to have each process pass back its results in a pipe, and copy them, but this will be much slower. Another is to use threads instead of processes, since threads share memory. Another is to pass messages. Another is a memory-mapped file. But shared memory that you cast to a pointer to a structure is the simplest way to go, and has the best performance.
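For comparison, the pipe alternative could look roughly like the sketch below. It assumes hypothetical fixed sizes (AM, AN, BP), one child and one pipe per row, and rows small enough to fit in the pipe buffer; multiply_pipes is an illustrative name, not something from the question.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical fixed sizes; the question reads the real ones from a file. */
enum { AM = 2, AN = 3, BP = 2 };

/* Each child computes one row and writes it, as raw ints, to its own pipe.
 * The parent reads the rows back in order. Returns 0 on success, -1 on failure. */
int multiply_pipes(int a[AM][AN], int b[AN][BP], int r[AM][BP])
{
    int fds[AM][2];
    pid_t pids[AM];
    int started = 0, failed = 0;

    for (int h = 0; h < AM; ++h) {
        if (pipe(fds[h]) < 0) { perror("pipe"); failed = 1; break; }
        pids[h] = fork();
        if (pids[h] < 0) {
            perror("fork");
            close(fds[h][0]);
            close(fds[h][1]);
            failed = 1;
            break;
        }
        if (pids[h] == 0) {
            /* Child: compute row h into a local buffer, then send it. */
            int row[BP];
            close(fds[h][0]);
            for (int j = 0; j < BP; ++j) {
                row[j] = 0;
                for (int c = 0; c < AN; ++c)
                    row[j] += a[h][c] * b[c][j];
            }
            ssize_t n = write(fds[h][1], row, sizeof row);
            _exit(n == (ssize_t)sizeof row ? 0 : 1);
        }
        close(fds[h][1]);  /* parent keeps only the read end */
        ++started;
    }

    for (int h = 0; h < started; ++h) {
        /* Read exactly one row; read() may return fewer bytes than asked. */
        char *dst = (char *)r[h];
        size_t got = 0;
        while (got < sizeof r[h]) {
            ssize_t n = read(fds[h][0], dst + got, sizeof r[h] - got);
            if (n <= 0) { failed = 1; break; }
            got += (size_t)n;
        }
        close(fds[h][0]);

        int status;
        if (waitpid(pids[h], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }
    return failed ? -1 : 0;
}
```

Every result value is copied into the pipe by the child and out of it again by the parent, which is the extra copying this approach pays for.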

Finally, if you are writing to the same shared memory, you need to be careful not to let two processes write to the same memory. If you open one process per row, and your rows are properly aligned, you shouldn’t have an issue, but the safe way to do this is to use either locks or atomic variables.


4 Comments

Are you sure that using pipes will be much slower than shared memory? Imagine each child processes one row, binary writes it to a pipe (one pipe per child), and parent process just collates what it reads from the pipes. It should be harder to write, but not sure about the time. But I'm pretty sure it would be faster than using shared memory and a global lock at each variable write (upvoted anyway :-) ).
But as @lorehead said, I shouldn't need locks, because each process would write to a different part of the array, so two processes would never write to the same memory. Am I right?
Atomic variables should not require a global lock at each write. In fact, if no process’ section of the array shares a cache line with any other, the processes should all just be able to write to their section with no overhead at all. You could, I suppose, also use a different shared memory chunk for each object. But to answer your question: consider how many copies of each piece of the array would have to be made to transfer them over pipes.
Here’s a kind of hackish solution that avoids having to deal with any atomic stuff: declare the rows as an array of pointers to row vectors, then initialize each to a segment of shared memory that only one daughter process will open. It’s not a great solution in the real world because you’ll have no cache coherency and a lot of overhead, but it keeps the chunks of memory everyone touches completely separate in a transparent way.
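A minimal sketch of that one-segment-per-row idea, under assumptions of my own: hypothetical fixed sizes (AM, AN, BP), one child per row, and the parent attaching each segment only after the children have exited, to copy the row out. The name multiply_row_segments is illustrative.

```c
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical fixed sizes; the question reads the real ones from a file. */
enum { AM = 2, AN = 3, BP = 2 };

/* One shared segment per result row; child h attaches only segment h.
 * Returns 0 on success, -1 on failure. */
int multiply_row_segments(int a[AM][AN], int b[AN][BP], int r[AM][BP])
{
    int ids[AM];
    int created = 0, started = 0, failed = 0;

    /* Create every segment up front so each child already knows its id. */
    for (int h = 0; h < AM; ++h) {
        ids[h] = shmget(IPC_PRIVATE, sizeof(int[BP]), IPC_CREAT | 0600);
        if (ids[h] < 0) { perror("shmget"); failed = 1; break; }
        ++created;
    }

    for (int h = 0; !failed && h < AM; ++h) {
        pid_t pid = fork();
        if (pid < 0) { perror("fork"); failed = 1; break; }
        if (pid == 0) {
            /* Child: attach its own row and nothing else. */
            int *row = shmat(ids[h], NULL, 0);
            if (row == (void *)-1) _exit(1);
            for (int j = 0; j < BP; ++j) {
                int sum = 0;
                for (int c = 0; c < AN; ++c)
                    sum += a[h][c] * b[c][j];
                row[j] = sum;
            }
            shmdt(row);
            _exit(0);
        }
        ++started;
    }

    for (int i = 0; i < started; ++i) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }

    /* The array of pointers to row vectors: attach, copy out, release. */
    int *rows[AM];
    for (int h = 0; h < created; ++h) {
        rows[h] = shmat(ids[h], NULL, 0);
        if (rows[h] == (void *)-1) {
            failed = 1;
        } else {
            memcpy(r[h], rows[h], sizeof(int[BP]));
            shmdt(rows[h]);
        }
        shmctl(ids[h], IPC_RMID, NULL);
    }
    return failed ? -1 : 0;
}
```

This costs one shmget()/shmat() pair per row, which is the overhead mentioned above, but no two processes ever write to the same segment.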
