
I tried to compare the performance gain from parallel computing using Python's threading module against normal sequential computing, but couldn't find any real difference. Here's what I did:

import time, threading, Queue

q = Queue.Queue()

def calc(_range):
    exponent = (x ** 5 for x in _range)
    q.put([x ** 0.5 for x in exponent])

def calc1(_range):
    exponent = (x ** 5 for x in _range)
    return [x ** 0.5 for x in exponent]

def multithreads(threadlist):
    d = []
    for x in threadlist:
        t = threading.Thread(target=calc, args=(x,))
        t.start()
        t.join()
        s = q.get()
        d.append(s)
    return d


threads = [range(100000), range(200000)]

start = time.time()
#out = multithreads(threads)
out1 = [calc1(x) for x in threads]
end = time.time()
print end - start

Timing using threading: 0.9390001297
Timing running in sequence: 0.911999940872

The timing running in sequence was consistently lower than using multithreading. I have a feeling there's something wrong with my multithreading code.

Can someone point me in the right direction, please? Thanks.

  • Why do you have two calc functions? Commented Jan 16, 2016 at 12:57
  • One of them is for the multithreading, as I wanted the output returned in a queue. Hence I used one for threading and the other for sequence. Commented Jan 16, 2016 at 13:00
  • Consider a more expensive function call — forking or starting threads isn't free, so for very small calls, the initialization overhead is very non-trivial compared to the actual "work." Commented Jan 16, 2016 at 13:52
  • @tristan, I think os.fork is too expensive for my main task. Thanks anyway. Commented Jan 16, 2016 at 14:06
  • Sorry, I meant that the overhead you're going to incur going with native threads or processes is going to be very large for such a small amount of work. I'd suggest looking into native threads for potential improvement. Commented Jan 17, 2016 at 5:02

1 Answer


The reference implementation of Python (CPython) has a Global Interpreter Lock (GIL), which ensures that only one thread executes Python bytecode at a time. You can switch, for example, to IronPython, which has no GIL, or you can take a look at the multiprocessing module, which spawns several Python processes that can execute your code independently. In some scenarios, using threads in Python can even be slower than a single thread, because the context switches between threads on the CPU also introduce some overhead.
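As a side note: the loop in the question also starts each thread and immediately joins it, which serializes the work even before the GIL comes into play. A minimal sketch of the usual pattern — start all threads first, then join them all — might look like this (written for Python 3, where the module is `queue` rather than `Queue`; smaller ranges are used just for illustration):

```python
import threading
import queue  # "Queue" in Python 2

q = queue.Queue()

def calc(_range):
    # Same computation as in the question.
    q.put([x ** 0.5 for x in (x ** 5 for x in _range)])

def multithreads(ranges):
    # Start ALL threads before joining any, so they can overlap in time.
    workers = [threading.Thread(target=calc, args=(r,)) for r in ranges]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    # Note: queue order is not guaranteed to match input order.
    return [q.get() for _ in workers]

results = multithreads([range(1000), range(2000)])
```

Even with this fix, a CPU-bound workload like this one won't speed up under CPython's threads, for the GIL reasons described above — but at least the threads are now actually running concurrently.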

Take a look at this page for some deeper insights and help.

If you want to dive deeper into this topic, I can highly recommend this talk by David Beazley.


3 Comments

I'll consider IronPython, as for some reason I can't import the multiprocessing module. However, as long as there's threading in CPython, multithreading is possible. I believe there's a way out. Thanks.
I tried importing the multiprocessing module but got this error message: 'no module named dummy'. However, I checked the multiprocessing package in the Python directory and found dummy. Any suggestions on what to do?
Multithreading is possible in CPython, but getting a speedup from multiple cores via multithreading is unlikely, due to the GIL.
