I'm trying to use a Manager from Python's multiprocessing module to share a list (among other things) across processes. It appears that the only way to change values in the list is to do it element by element, i.e. list1[2] = list2[2]. This is a pain if the lists are long or complicated. I would like to just be able to say list1 = list2, where list1 is the shared object controlled by the manager and list2 is a list in a worker process. It will let me append to list1 (so it doesn't have to be fixed length), but it won't let me clear list1, so list1.clear() followed by list1.append(list2) won't work. Does anybody have a simple solution to this?
2 Answers
I seem to recall that managers are slow, and less needed in newer versions of Python.
You can put an array of ints in shared memory using multiprocessing.Array. That's likely what you're looking for, at least if your "lists" contain simple types.
From https://docs.python.org/3/library/multiprocessing.html :
from multiprocessing import Process, Value, Array

def f(n, a):
    n.value = 3.1415927
    for i in range(len(a)):
        a[i] = -a[i]

if __name__ == '__main__':
    num = Value('d', 0.0)
    arr = Array('i', range(10))

    p = Process(target=f, args=(num, arr))
    p.start()
    p.join()

    print(num.value)
    print(arr[:])
From there, try slicing assignment. I don't know for a fact that'll work, but it's the thing to try.
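Here is a minimal sketch of slice assignment on a shared Array, under the assumption that the new data has exactly the same length as the Array. A shared Array has a fixed size, so a slice assignment with a different number of elements raises ValueError. The names overwrite and worker are made up for illustration.

```python
from multiprocessing import Array, Process


def overwrite(shared, values):
    """Replace the whole contents of a shared Array in one statement."""
    values = list(values)
    # A shared Array cannot grow or shrink, so check the length up front.
    if len(values) != len(shared):
        raise ValueError("new data must have the same length as the Array")
    shared[:] = values  # single slice assignment, no per-element loop


def worker(shared):
    # Hypothetical worker: overwrite all ten slots at once.
    overwrite(shared, range(5, 15))


if __name__ == "__main__":
    arr = Array("i", range(10))
    p = Process(target=worker, args=(arr,))
    p.start()
    p.join()
    print(arr[:])
```

If the data is nested (a list of lists), it would have to be flattened before being assigned, since an Array of 'i' holds a flat sequence of ints.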
I tend to use queues though. They give loose coupling, and work better with more complex types. For example: to_network_queue: multiprocessing.Queue = multiprocessing.Queue(maxsize=max_messages)
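A rough sketch of the queue approach, in case it helps. It assumes each message is a whole list (nested lists are fine, since items are pickled) and that the consumer only wants the newest message currently waiting. The helper names producer and drain_latest are hypothetical, not part of multiprocessing.

```python
import multiprocessing
import queue  # only needed for the queue.Empty exception


def producer(q, frames):
    # Each frame is sent as one message; it may be a list of lists.
    for frame in frames:
        q.put(frame)


def drain_latest(q):
    """Block until one item arrives, then keep only the newest item waiting."""
    latest = q.get()
    while True:
        try:
            latest = q.get_nowait()  # discard older items as newer ones appear
        except queue.Empty:
            return latest


if __name__ == "__main__":
    q = multiprocessing.Queue(maxsize=100)
    frames = [[[i, i + 1], [i + 2, i + 3]] for i in range(5)]
    p = multiprocessing.Process(target=producer, args=(q, frames))
    p.start()
    print(drain_latest(q))  # which frame this is depends on timing
    p.join()  # fine here because the messages are tiny
```

The consumer never touches shared state directly; it just receives complete messages, which is the loose coupling mentioned above.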
Also, multiprocessing isn't as nice as concurrent.futures, which wraps multiprocessing and threading, and allows switching between processes and threads with a single-line change.
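To illustrate the single-line switch, here is a small sketch. The function names square and run are invented for the example; the only thing that changes between processes and threads is which executor class is passed in.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def square(n):
    return n * n


def run(executor_cls, values):
    """Map square over values using whichever executor class is given."""
    with executor_cls(max_workers=2) as pool:
        return list(pool.map(square, values))


if __name__ == "__main__":
    print(run(ProcessPoolExecutor, range(5)))  # worker processes
    print(run(ThreadPoolExecutor, range(5)))   # worker threads, same code
```

With the process pool, the function and its arguments have to be picklable, so square is defined at module level.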
Here's what I mean by "slicing assignment":
$ /usr/local/cpython-3.9/bin/python3
below cmd output started 2020 Fri Oct 23 10:26:41 AM PDT
Python 3.9.0 (default, Oct 14 2020, 16:19:47)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import array
>>> a = array.array('i', range(10))
>>> a
array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b = array.array('i', range(5, 15))
>>> b
array('i', [5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
>>> a[:] = b[:]
>>> a
array('i', [5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
>>>
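As far as I know, the same idiom also works on a list created by a Manager, because the list proxy forwards __setitem__ (including slices) to the real list in the manager process. This is a sketch rather than something I have benchmarked; replace_contents and worker are hypothetical names, and list1 / list2 follow the question's naming.

```python
from multiprocessing import Manager, Process


def replace_contents(shared, new_items):
    """Overwrite everything in `shared` in place; the length may change."""
    shared[:] = new_items


def worker(shared):
    # list2 lives only in the worker; a list of lists is fine here.
    list2 = [[1, 2], [3, 4], [5, 6]]
    replace_contents(shared, list2)


if __name__ == "__main__":
    with Manager() as manager:
        list1 = manager.list([0, 0])
        p = Process(target=worker, args=(list1,))
        p.start()
        p.join()
        print(list1[:])  # a plain-list copy of the shared contents
```

Note that list1 = list2 inside the worker would only rebind the local name, which is why the slice form is needed. Also, the inner lists are stored as ordinary lists in the manager, so changing one of them in place through the proxy will not propagate; reassign the whole row (or the whole list) instead.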
1 Comment
Sorry if this should be a comment - comments seem to be very limited in length. I have tried Array (which is what I am currently using) and slicing didn't work. I still had to put things together entry by entry. Also, the input comes as a list of lists (i.e. a 2D array), and it isn't obvious to me how I can get Array to work with a 2D array, at least not trivially.
I had thought of queues, but the producer creates data far faster than the consumer takes it, and the consumer only ever wants the latest data, so I went with Arrays because that way I am overwriting the old data. However, I guess I could make the consumer pop off the latest data and then clear the queue (?). That would be a possibility.
I was unaware of concurrent.futures, so I will have to take a look at that.