
I have two programs, one written in C and one written in Python. I want to pass a few arguments to the C program from Python and do this many times in parallel, because I have about 1 million such C calls.

Essentially I did it like this:

from subprocess import check_call
import multiprocessing as mp
from itertools import combinations

def run_parallel(f1, f2):
    check_call(f"./c_compiled {f1} {f2} &", cwd='.', shell=True)

if __name__ == '__main__':
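    # fns: a list of input file names, assumed to be defined earlier in the script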
    pairs = combinations(fns, 2)

    pool = mp.Pool(processes=32)
    pool.starmap(run_parallel, pairs)
    pool.close()

However, sometimes I get the following error (though the main process keeps running):

/bin/sh: fork: retry: No child processes

Moreover, sometimes the whole Python program fails with:

BlockingIOError: [Errno 11] Resource temporarily unavailable

I found that, while it's still running, I can see a lot of processes spawned for my user (around 500), whereas I have at most 512 available.
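
For reference, the per-user process limit can be checked from Python on Linux via the standard resource module; a minimal check that prints the soft and hard limits:

import resource

# soft/hard limit on the number of processes for this user (Linux)
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print(soft, hard)  # e.g. 512 512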

This does not happen all the time (it depends on the arguments), but it does happen often. How can I avoid these problems?

4 Comments

  • There is no C++ in this question, right? Commented Aug 15, 2019 at 15:39
  • Removed. The problem persisted when I did this with a cpp file. Commented Aug 15, 2019 at 15:43
  • That the program you run is written in C is irrelevant to your problem. And if you "have about 1 million of such C calls", then running them as separate programs is very inefficient. If you have control over the program, consider making it into a Python module that can be imported and called like any normal Python function, possibly using a thread pool for parallelism. Commented Aug 15, 2019 at 15:44
  • Indeed, if this is just your own C code, you're better off wrapping it in Python in one of the myriad ways of doing so. I'd suggest Cython. Commented Aug 15, 2019 at 17:59

2 Answers


I'd wager you're running up against a process/file descriptor/... limit there.

You can "save" one process per invocation by not using shell=True; note that the list form below also drops the trailing &, so check_call actually waits for each job to finish:

check_call(["./c_compiled", f1, f2], cwd='.')

But it'd be better still to make that C code callable from Python instead of creating processes to do so. By far the easiest way to interface "random" C code with Python is Cython.
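
For illustration, a minimal Cython sketch might look like the following; the header and function names here are assumptions, standing in for whatever entry point the real C code exposes:

# c_wrapper.pyx -- hypothetical sketch; "c_compiled.h" and compare_files are assumed names
cdef extern from "c_compiled.h":
    int compare_files(const char *f1, const char *f2)

def run_pair(f1, f2):
    # encode Python strings to bytes so Cython can pass them as C strings
    return compare_files(f1.encode(), f2.encode())

Once built (for example with cythonize), run_pair can be imported and called from a process or thread pool directly, with no process creation per call.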


"many times in parallel" you can certainly do, for reasonable values of "many", but "about 1 million of such C calls" all running at the same time on the same individual machine is almost surely out of the question.

You can lighten the load by running the jobs without interposing a shell, as discussed in @AKX's answer, but that's not enough to bring your objective into range. Better would be to queue up the jobs so as to run only a few at a time -- once you reach that number of jobs, start a new one only when a previous one has finished. The exact number you should try to keep running concurrently depends on your machine and on the details of the computation, but something around the number of CPU cores might be a good first guess.
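
A minimal sketch of that queued approach, reusing the setup from the question (fns is assumed to be the same list of input file names; the pool size is just the first-guess heuristic above):

import os
from itertools import combinations
from subprocess import check_call
import multiprocessing as mp

def run_pair(f1, f2):
    # no shell and no "&": check_call blocks until this one job has finished
    check_call(["./c_compiled", f1, f2])

if __name__ == '__main__':
    pairs = combinations(fns, 2)  # fns: list of input file names, as in the question
    # keep roughly one job per core in flight; the pool queues the rest
    with mp.Pool(processes=os.cpu_count()) as pool:
        pool.starmap(run_pair, pairs)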

Note in particular that it is counterproductive to have more jobs at any one time than the machine has resources to run concurrently. If your processes do little or no I/O then the number of cores in your machine puts a cap on that, for only the processes that are scheduled on a core at any given time (at most one per core) will make progress while the others wait. Switching among many processes so as to attempt to avoid starving any of them will add overhead. If your processes do a lot of I/O then they will probably spend a fair proportion of their time blocked on I/O, and therefore not (directly) requiring a core, but in this case your I/O devices may well create a bottleneck, which might prove even worse than the limitation from number of cores.

3 Comments

  • processes=32 in the Pool initializer basically limits things to running, well, 32 at a time here, since check_call waits for the process to finish.
  • Fair enough, @AKX. In all likelihood, though, 32 is still more than is actually useful for the OP, and reducing that number may mitigate or even solve their issue, since they report observing process counts near their limit.
  • (Of course it's possible that the C program spawns a subprocess, disowns it, then exits, so it's no longer "tracked" by check_call.)
