I have a Python/Cython application that is parallelized using OpenMP and makes several calls to the Intel MKL. Usually, I set the number of threads via OMP_NUM_THREADS=xx. Both the Cython module and MKL (Pardiso solver calls) correctly start several threads when I run my script using an Anaconda distribution (Python 3.6). The CPU load and the number of loaded cores are clearly visible in the system monitor.
However, when using the system's Python distribution (Python 3.6 under Arch Linux), only one thread is started, for both the Cython module and the Intel MKL.
At least for my Cython module, I can tell that the correct number of threads is requested (via prange()), but only one thread is obtained.
No compilation errors arise, and of course the flag '-fopenmp' is used for compilation. Since the issue affects both my Cython module and the Intel MKL, I assume it is somehow related to my system's OpenMP. What is the issue here? Thank you!
There may be an OMP_NUM_THREADS setting for the system that does not apply to Anaconda. Try setting that environment variable manually when running the Python script.
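One way to check this is sketched below. It is only a rough diagnostic, not a confirmed fix: it prints the thread-related variables the interpreter actually sees, then launches a child process with OMP_NUM_THREADS set explicitly. The helper names thread_settings and run_with_threads are made up for this example, and the list of variables (OMP_NUM_THREADS, OMP_THREAD_LIMIT, MKL_NUM_THREADS) is an assumption about what might be relevant here.

```python
import os
import subprocess
import sys

# Assumed set of variables worth inspecting; extend as needed.
THREAD_VARS = ("OMP_NUM_THREADS", "OMP_THREAD_LIMIT", "MKL_NUM_THREADS")


def thread_settings(env=None):
    """Return the thread-related variables found in env (default: os.environ)."""
    env = os.environ if env is None else env
    return {name: env.get(name) for name in THREAD_VARS}


def run_with_threads(cmd, num_threads):
    """Run cmd with OMP_NUM_THREADS forced to num_threads in the child process."""
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(num_threads)
    # stdout=PIPE / universal_newlines keep this compatible with Python 3.6.
    return subprocess.run(cmd, env=env, stdout=subprocess.PIPE,
                          universal_newlines=True, check=True)


if __name__ == "__main__":
    # What does this interpreter see right now?
    print(thread_settings())
    # Stand-in child command; replace it with the real script invocation.
    child = [sys.executable, "-c",
             "import os; print(os.environ.get('OMP_NUM_THREADS'))"]
    print(run_with_threads(child, 4).stdout.strip())
```

Running this with both the Anaconda interpreter and the system interpreter shows whether the two see different settings. If the variable is set correctly in both and the system Python still uses a single thread, the environment variable is probably not the cause.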