I am trying to parallelize operations on a big array, and I have summarized my approach in the code snippet below. Because the operations on the big array are costly, I want to run only 4 (i.e. n_cpus) of the 100 processes at a time. After each iteration finishes, some garbage collection should be done before the next iteration starts. However, the main loop runs only the first iteration and then terminates. I would be glad if a parallel processing expert could point out how to correct my code to achieve this.
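from multiprocessing import Process

def train_model(model, big_array, i):
    # Rebinds the local name only; the parent process never sees this result.
    model = do_operations_on(big_array)

# edit: this part is within a class
n_processes = 100
n_cpus = 4
models = [None for _ in range(n_processes)]
# Integer division: range() needs an int (plain / yields a float in Python 3).
n_iterations = n_processes // n_cpus
for it in range(n_iterations):
    # Start one batch of n_cpus processes, then wait for the whole batch.
    procs = [Process(target=train_model,
                     args=(models[it*n_cpus+i], big_array, i)) for i in range(n_cpus)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()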
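For comparison, here is a minimal sketch of the same batching idea using `multiprocessing.Pool`, under a few assumptions. The body of `train_model` is a hypothetical stand-in for `do_operations_on(big_array)`, and `batches` and `train_all` are illustrative names that do not appear in the code above. Each worker returns its result to the parent instead of assigning to an argument, because an assignment made inside a child process is not visible in the parent. Note that this version pickles `big_array` once per task, which may be too expensive for a really large array.

```python
from multiprocessing import Pool


def train_model(task):
    """Train one model in a worker process and return (index, model)."""
    index, big_array = task
    # Hypothetical stand-in for do_operations_on(big_array).
    return index, sum(big_array) + index


def batches(n_items, batch_size):
    """Yield lists of indices, batch_size at a time (the last may be shorter)."""
    for start in range(0, n_items, batch_size):
        yield list(range(start, min(start + batch_size, n_items)))


def train_all(big_array, n_processes=100, n_cpus=4):
    """Run n_processes training tasks, at most n_cpus at a time."""
    models = [None] * n_processes
    for batch in batches(n_processes, n_cpus):
        tasks = [(i, big_array) for i in batch]
        # A fresh pool per batch, so its workers exit before the next batch.
        with Pool(processes=n_cpus) as pool:
            # map() blocks until every task in this batch has finished.
            for index, model in pool.map(train_model, tasks):
                models[index] = model
        # Any per-iteration cleanup (e.g. gc.collect()) would go here.
    return models
```

To try it, call `train_all(big_array)` from under an `if __name__ == "__main__":` guard, so that worker processes can import the module without re-running the training loop. If the pause between batches is not actually needed, a single `Pool(n_cpus)` mapped over all 100 tasks would also keep at most 4 running at once.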