import random
import os
from multiprocessing import Process

num = random.randint(0, 100)

def show_num():
    print("pid:{}, num is {}".format(os.getpid(), num))

if __name__ == '__main__':
    print("pid:{}, num is {}".format(os.getpid(), num))
    p = Process(target=show_num)
    p.start()
    p.join()
    print('Parent Process Stop')

The code above shows the basic usage of creating a process. When I run this script on Windows, the variable num differs between the parent process and the child process. When I run it on Linux, however, num is the same in both. I understand that the two systems create processes differently; for example, Windows has no fork method. Can someone give a more detailed explanation of the difference? Thank you very much.

1 Answer

The difference behind the behavior you describe is exactly what you mentioned: the start method used to create the process. On Linux, the default has traditionally been fork (macOS has defaulted to spawn since Python 3.8). On Windows, the only available option is spawn.
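To check which start method a particular interpreter uses, you can ask the multiprocessing module directly. This is only a minimal sketch; the helper name describe_start_methods is made up for illustration and is not part of the library:

```python
import multiprocessing as mp
import platform


def describe_start_methods():
    """Collect the platform name and the start methods known to this interpreter."""
    return {
        "platform": platform.system(),
        # The start method used when none is chosen explicitly.
        # Note: calling get_start_method() fixes the default for this process.
        "default": mp.get_start_method(),
        # Every start method this platform supports.
        "available": mp.get_all_start_methods(),
    }


if __name__ == '__main__':
    for key, value in describe_start_methods().items():
        print("{}: {}".format(key, value))
```

The exact values depend on the operating system and Python version, so no output is shown here.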

fork
As described in the Overview section of this Wiki page (in a slightly different order):

The fork operation creates a separate address space for the child. The child process has an exact copy of all the memory segments of the parent process.

The child process calls the exec system call to overlay itself with the other program: it ceases execution of its former program in favor of the other.

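This means that, when using fork, the child process already has the variable num in its address space and uses it. random.randint(0, 100) is not called again. Note that multiprocessing's fork start method only forks: the child does not go on to exec another program, so it keeps running with the copied interpreter state.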
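The inherited value can be observed by having a forked child send its copy of num back to the parent. The following is a sketch under the assumption that the platform offers the fork start method; the helper names report and child_sees_same_num are hypothetical, and on platforms without fork the function simply returns None:

```python
import multiprocessing as mp
import random

# Evaluated once in the parent; a forked child inherits this value.
num = random.randint(0, 100)


def report(conn):
    """Runs in the child: send the child's view of num back to the parent."""
    conn.send(num)
    conn.close()


def child_sees_same_num():
    """Return True if a forked child sees the parent's num, None if fork is unavailable."""
    if "fork" not in mp.get_all_start_methods():
        return None  # e.g. on Windows
    ctx = mp.get_context("fork")  # use fork for this process only
    parent_conn, child_conn = ctx.Pipe()
    p = ctx.Process(target=report, args=(child_conn,))
    p.start()
    child_num = parent_conn.recv()
    p.join()
    return child_num == num


if __name__ == '__main__':
    print("child sees the same num:", child_sees_same_num())
```

Using get_context("fork") instead of set_start_method keeps the choice local to this one process object and leaves the global default untouched.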

spawn
As the multiprocessing docs describe:

The parent process starts a fresh python interpreter process.

In this fresh interpreter process, the module from which the child is spawned is executed again. Oversimplified, this runs python.exe your_script.py a second time. Hence, a new variable num is created in the child process by assigning it the return value of another call to random.randint(0, 100). It is therefore very likely that the value of num differs between the processes.
This is, by the way, also the reason why you must guard the creation and start of a process with the if __name__ == '__main__' idiom when using spawn as the start method. Otherwise you end up with:

RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

You can use spawn in POSIX OSs as well, to mimic the behavior you have seen on Windows:

import random
import os
from multiprocessing import Process, set_start_method
import platform

num = random.randint(0, 100)

def show_num():
    print("pid:{}, num is {}".format(os.getpid(), num))

if __name__ == '__main__':
    print(platform.system())
    # change the start method for new processes to spawn
    set_start_method("spawn")
    print("pid:{}, num is {}".format(os.getpid(), num))
    p = Process(target=show_num)
    p.start()
    p.join()
    print('Parent Process Stop')

Output:

Linux
pid:26835, num is 41
pid:26839, num is 13
Parent Process Stop
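If the goal is for the child to see the same value regardless of the start method, one option is to stop relying on module-level state and pass the value to the child explicitly. A minimal sketch, assuming the value can be pickled (an int can); format_num is a hypothetical helper added for illustration:

```python
import os
import random
from multiprocessing import Process


def format_num(num):
    """Build the same message as in the question for the current process."""
    return "pid:{}, num is {}".format(os.getpid(), num)


def show_num(num):
    # num arrives as an argument, so it does not matter whether the
    # module was re-executed in the child or not
    print(format_num(num))


if __name__ == '__main__':
    # Generated only in the parent, inside the guard
    num = random.randint(0, 100)
    print(format_num(num))
    p = Process(target=show_num, args=(num,))
    p.start()
    p.join()
    print('Parent Process Stop')
```

Because num is created inside the guarded block and handed over through args, the parent and the child should print the same number under fork as well as under spawn.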