
I tried to compare the performance gain from parallel computing using Python's threading module against normal sequential computing, but couldn't find any real difference. Here's what I did:

import time, threading, Queue

q = Queue.Queue()

def calc(_range):
    exponent = (x ** 5 for x in _range)
    q.put([x ** 0.5 for x in exponent])

def calc1(_range):
    exponent = (x ** 5 for x in _range)
    return [x ** 0.5 for x in exponent]

def multithreads(threadlist):
    d = []
    for x in threadlist:
        t = threading.Thread(target=calc, args=(x,))
        t.start()
        t.join()
        s = q.get()
        d.append(s)
    return d


threads = [range(100000), range(200000)]

start = time.time()
#out = multithreads(threads)
out1 = [calc1(x) for x in threads]
end = time.time()
print end - start

Timing using threading: 0.9390001297
Timing running in sequence: 0.911999940872

The timing running in sequence was consistently lower than using multithreading. I have a feeling there's something wrong with my multithreading code.

Can someone point me in the right direction, please? Thanks.

  • Why do you have two calc functions? Commented Jan 16, 2016 at 12:57
  • One of them is for the multithreading, as I wanted the output returned in a queue. Hence I used one for threading and the other for sequence. Commented Jan 16, 2016 at 13:00
  • Consider a more expensive function call — forking or starting threads isn't free, so for very small calls, the initialization overhead is very non-trivial compared to the actual "work." Commented Jan 16, 2016 at 13:52
  • @tristan, I think os.fork is too expensive for my main task. Thanks anyway. Commented Jan 16, 2016 at 14:06
  • Sorry, I meant that the overhead you're going to incur going with native threads or processes is going to be very large for such a small amount of work. I'd suggest looking into native threads for potential improvement. Commented Jan 17, 2016 at 5:02

1 Answer


The reference implementation of Python (CPython) has a Global Interpreter Lock (GIL), which ensures that only one thread executes Python bytecode at a time. You can switch, for example, to IronPython, which has no GIL, or you can take a look at the multiprocessing module, which spawns several Python processes that can execute your code independently. In some scenarios, using threads in Python can even be slower than a single thread, because the context switches between threads on the CPU also introduce some overhead.
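As a side note: the loop in the question also starts each thread and immediately joins it, which serializes the work even before the GIL comes into play. A minimal sketch of the usual pattern — start all threads first, then join them all — might look like this (written for Python 3, where the module is `queue` rather than `Queue`; smaller ranges are used just for illustration):

```python
import threading
import queue  # "Queue" in Python 2

q = queue.Queue()

def calc(_range):
    # Same computation as in the question.
    q.put([x ** 0.5 for x in (x ** 5 for x in _range)])

def multithreads(ranges):
    # Start ALL threads before joining any, so they can overlap in time.
    workers = [threading.Thread(target=calc, args=(r,)) for r in ranges]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    # Note: queue order is not guaranteed to match input order.
    return [q.get() for _ in workers]

results = multithreads([range(1000), range(2000)])
```

Even with this fix, a CPU-bound workload like this one won't speed up under CPython's threads, for the GIL reasons described above — but at least the threads are now actually running concurrently.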

Take a look at this page for some deeper insights and help.

If you want to dive deeper into this topic, I can highly recommend this talk by David Beazley.


3 Comments

I'll consider IronPython, as for some reason I can't import the multiprocessing module. However, as long as there's threading in CPython, multithreading is possible. I believe there's a way out. Thanks.
I tried importing the multiprocessing module but got this error message: 'no module named dummy'. However, I checked the multiprocessing package in the Python directory and found dummy. Any suggestions on what to do?
Multithreading is possible in CPython, but getting a speedup from multiple cores via multithreading is unlikely, due to the GIL.
