Recently I wanted to speed up some of my code using parallel processing, as I have a Quad Core i7 and it seemed like a waste. I learned about python's (I'm using v 3.3.2 if it maters) GIL and how it can be overcome using the multiprocessing module, so I wrote this simple test program:
from multiprocessing import Process, Queue
def sum(a,b):
su=0
for i in range(a,b):
su+=i
q.put(su)
q= Queue()
p1=Process(target=sum, args=(1,25*10**7))
p2=Process(target=sum, args=(25*10**7,5*10**8))
p3=Process(target=sum, args=(5*10**8,75*10**7))
p4=Process(target=sum, args=(75*10**7,10**9))
p1.run()
p2.run()
p3.run()
p4.run()
r1=q.get()
r2=q.get()
r3=q.get()
r4=q.get()
print(r1+r2+r3+r4)
The code runs in about 48 seconds measured using cProfile, however the single process code
def sum(a,b):
su=0
for i in range(a,b):
su+=i
print(su)
sum(1,10**9)
runs in about 50 seconds. I understand that the method has overheads but i expected the improvements to be more drastic. The error with fork() doesn't apply to my as I'm running the code on a Mac.
multiprocessingstarts separate processes, which get separate rows in Activity Monitor (generally all called "Python").multiprocessing.Poolorconcurrent.futures.ProcessPoolExecutorthan explicitProcesses andQueues. For example, compare this.