
I have built a setup that passes a dataset as an option to a ProcessPoolExecutor; the inputs to the workers are indices into that dataset.

I could provide an MWE, but I tried several approaches and all of them resulted in something like a fork bomb.

I think the cause is some internal multiprocess execution in PyTorch, but I am unable to provide proof for this assumption.

So the question is: is there a way to use multiprocessing in PyTorch with data that cannot be collated into a batch (the samples have different shapes), without creating a fork bomb?

The goal is to save the data to disk at the end of each process, so I don't have to take care of synchronization (and the complete data is too large to fit in memory all at once).

1 Answer

It would be nice to have some code to try to replicate the issue.

I have two possible guesses for you:

It could be an interaction between torch.multiprocessing and Python's concurrent.futures. This may be happening because you have not passed a context (mp_context) when creating the pool, so it falls back to the default multiprocessing context. This might be breaking torch's spawning. Try setting the pool's context to a spawn context obtained from torch.multiprocessing, for example:

import torch.multiprocessing as pmp
ctx = pmp.get_context("spawn")
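To make the suggestion concrete, here is a minimal sketch of handing a spawn context to the pool while each worker writes its own result to disk. It is an illustration under assumptions, not your actual setup: `process_index`, `run_pool`, the file naming, and the list-based placeholder sample are all hypothetical. It uses the standard library's `multiprocessing.get_context` and `pickle` so that it runs without torch; `torch.multiprocessing.get_context("spawn")` and `torch.save` would be the analogous calls.

```python
import multiprocessing as mp
import pickle
from concurrent.futures import ProcessPoolExecutor
from functools import partial
from pathlib import Path


def process_index(idx, out_dir):
    """Build one variable-length sample and write it to its own file."""
    # Hypothetical placeholder for a tensor whose shape depends on idx.
    sample = [float(idx)] * (idx + 1)
    out_path = Path(out_dir) / f"sample_{idx}.pkl"
    with open(out_path, "wb") as f:
        pickle.dump(sample, f)
    # Return only the path, so no large data is sent back to the parent.
    return str(out_path)


def run_pool(indices, out_dir, max_workers=2):
    """Process indices in spawned workers; each worker saves its own output."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # Explicit start method instead of the platform default.
    ctx = mp.get_context("spawn")
    worker = partial(process_index, out_dir=out_dir)
    with ProcessPoolExecutor(max_workers=max_workers, mp_context=ctx) as pool:
        return list(pool.map(worker, indices))
```

With the spawn start method, call `run_pool(...)` from under an `if __name__ == "__main__":` guard, because every worker re-imports the main module and unguarded top-level code would run again in each child.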

At the cost of performance, try limiting the number of PyTorch threads to 1 or 2 with torch.set_num_threads, and do the same for the pool size. While doing this, monitor the memory usage.
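One way to apply the thread limit is a pool initializer that runs once in every worker. The sketch below is only an assumption about how this could be wired up: `plan_workers`, `init_worker`, and `make_pool` are hypothetical names, and the rule of keeping workers times threads at or below the core count is a rule of thumb, not something PyTorch requires.

```python
import multiprocessing as mp
import os
from concurrent.futures import ProcessPoolExecutor


def plan_workers(cpu_count, threads_per_worker=1):
    """Pick a pool size so workers * threads does not exceed the core count."""
    return max(1, cpu_count // threads_per_worker)


def init_worker(threads_per_worker):
    """Run once in each worker process before it takes any task."""
    # Imported here so the setting is applied inside the worker process.
    import torch

    torch.set_num_threads(threads_per_worker)


def make_pool(threads_per_worker=1):
    """Create a spawn-based pool whose workers cap their torch threads."""
    workers = plan_workers(os.cpu_count() or 1, threads_per_worker)
    return ProcessPoolExecutor(
        max_workers=workers,
        mp_context=mp.get_context("spawn"),
        initializer=init_worker,
        initargs=(threads_per_worker,),
    )
```

Starting with `make_pool(1)` and watching the process count and memory in a system monitor should show whether the number of processes stays at the pool size.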

I think one of two things is happening. Either the process copying done by Python's multiprocessing leaves some internal torch state unchanged, so it keeps forking until it becomes a bomb. Or, since the pool does not have access to the spawn context for its queue, the tensors are being pickled and your memory is exploding, which kills some of the processes. Python's multiprocessing does not kill the other processes when one dies (I believe), whereas torch does, so that might be causing the bomb or the memory overload.

Please update with any results, and maybe some code?
