So what I am trying to do ultimately is read a line, do some calculations with the info in that line, then add the result to some global object, but I can never seem to get it to work. For instance, test is always 0 in the code below. I know this is wrong, and I have tried doing it other ways, but it still isn't working.

import multiprocessing as mp

File = 'HGDP_FinalReport_Forward.txt'
#short_file = open(File)
test = 0

def pro(temp_line):
    global test
    temp_line = temp_line.strip().split()
    test = test + 1
    return len(temp_line)

if __name__ == "__main__":
    with open("HGDP_FinalReport_Forward.txt") as lines:
        pool = mp.Pool(processes = 10)
        t = pool.map(pro,lines.readlines())

  • Globals are generally a sign that you are doing something wrong. I advise changing the way your program works to avoid them - it will save you headaches in the long run, and there is always a better way. Commented Jun 19, 2012 at 21:34
  • The point of the multiprocessing module is that it spawns child processes rather than threads in the same process, with all the usual tradeoffs. Unfortunately, the documentation doesn't explain those tradeoffs at all, assuming you'll already know them. If you follow all of the "Programming guidelines" in the documentation, you may get away with not understanding, but you really should learn. Commented Jun 19, 2012 at 23:14

2 Answers

The worker processes spawned by the pool get their own copy of the global variable and update that. They don't share memory unless you set that up explicitly. The easiest solution is to communicate the final value of test back to the main process, e.g. via the return value. Something like (untested):

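import multiprocessing as mp

def pro(temp_line):
    # Each call returns its own count instead of touching a global.
    test = 0
    temp_line = temp_line.strip().split()
    test = test + 1
    return test, len(temp_line)

if __name__ == "__main__":
    with open("somefile.txt") as lines:
        pool = mp.Pool(processes = 10)
        tests_and_t = pool.map(pro,lines.readlines())
        # Split the list of (test, length) tuples into two sequences.
        tests, t = zip(*tests_and_t)
        test = sum(tests)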
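
If a counter shared between the workers is really wanted, the "set that up explicitly" route could look roughly like the sketch below. It assumes a `multiprocessing.Value` handed to each worker through the pool's `initializer`; the names `init_worker` and `count_lines` and the sample lines are made up for illustration and are not part of the original question.

```python
import multiprocessing as mp

counter = None  # replaced in every worker process by init_worker()

def init_worker(shared_counter):
    # Runs once in each worker; stores the shared Value in a module global.
    global counter
    counter = shared_counter

def pro(temp_line):
    fields = temp_line.strip().split()
    # The lock makes the read-modify-write safe across processes.
    with counter.get_lock():
        counter.value += 1
    return len(fields)

def count_lines(lines, processes=2):
    # "i" is a signed C int living in shared memory.
    shared = mp.Value("i", 0)
    with mp.Pool(processes=processes,
                 initializer=init_worker,
                 initargs=(shared,)) as pool:
        lengths = pool.map(pro, lines)
    return shared.value, lengths

if __name__ == "__main__":
    total, lengths = count_lines(["a b c\n", "d e\n", "f\n"])
    print(total, lengths)
```

Returning the value from the worker, as in the answer above, is usually simpler and avoids lock contention; the shared counter is only worth it when the workers themselves need to see the running total.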

3 Comments

The key thing here is that, using multiprocessing, the threads (well, processes) don't share state.
+1 for the answer, and +1 @Lattyware. I wish the multiprocessing documentation were a little clearer on how "spawning processes using an API similar to the threading module" differs from "creating threads", because that would solve half the problems with the module on SO…
Great stuff! It helped me with updating django models. Apparently the connection isn't forked and can be closed improperly by another process. To take care of that I used this approach but I didn't use zip, I just accessed the tuple elements from the list directly using a for loop, and then for each list item going through the tuple using tuple_element[index].
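
The loop-based alternative to zip mentioned in the last comment might look like the following minimal sketch. The `results` list here is a made-up stand-in for whatever `pool.map` returned, not data from the thread.

```python
# Hypothetical stand-in for the list returned by pool.map(pro, ...):
# each item is a (test, length) tuple.
results = [(1, 5), (1, 3), (1, 7)]

test = 0
lengths = []
for item in results:
    test += item[0]          # first tuple element: the per-line count
    lengths.append(item[1])  # second tuple element: the field count

print(test, lengths)
```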

Here is an example of using a global variable with multiprocessing.

We can clearly see that each process works with its own copy of the variable:

import multiprocessing
import os
import random
import sys

globalVariable = -1

def get():
    return globalVariable

def set(v):  # note: this name shadows the built-in set()
    global globalVariable
    globalVariable = v

def worker(a):
    # Each worker process reads and writes its own copy of globalVariable.
    oldValue = get()
    set(random.randint(0, 100))
    sys.stderr.write(' '.join([str(os.getpid()), str(a), 'old:', str(oldValue), 'new:', str(get()), '\n']))

print(get())
set(-2)
print(get())

if __name__ == '__main__':
    processPool = multiprocessing.Pool(5)
    results = processPool.map(worker, range(15))

Output:

27094 0 old: -2 new: 2 
27094 1 old: 2 new: 95 
27094 2 old: 95 new: 20 
27094 3 old: 20 new: 54 
27098 4 old: -2 new: 80 
27098 6 old: 80 new: 62 
27095 5 old: -2 new: 100 
27094 7 old: 54 new: 23 
27098 8 old: 62 new: 67 
27098 10 old: 67 new: 22 
27098 11 old: 22 new: 85 
27095 9 old: 100 new: 32 
27094 12 old: 23 new: 65 
27098 13 old: 85 new: 60 
27095 14 old: 32 new: 71
