The Python code is provided below. I'm using Python 3.10.12 on Linux Mint 21.3 (in case this information is needed). The version with a pool of 2 workers takes more time than the one without any multiprocessing. What am I doing wrong here?
import multiprocessing
import time
import random

def fun1( x ):
    y = 0
    for i in range( len( x ) ):
        y = y + x[i]
    return( y )

def fun2( x ):
    p = multiprocessing.Pool( 2 )
    y1, y2 = p.map( fun1, [ x[ : len( x ) // 2 ], x[ len( x ) // 2 : ] ] )
    y = y1 + y2
    return( y )

x = [ random.random() for i in range( 10 ** 6 ) ]
st = time.time()
ans = fun1( x )
et = time.time()
print( f"time = {et - st}, answer = {ans}." )
st = time.time()
ans = fun2( x )
et = time.time()
print( f"time = {et - st}, answer = {ans}." )

x = [ random.random() for i in range( 10 ** 7 ) ]
st = time.time()
ans = fun1( x )
et = time.time()
print( f"time = {et - st}, answer = {ans}." )
st = time.time()
ans = fun2( x )
et = time.time()
print( f"time = {et - st}, answer = {ans}." )
Here is what I get in the terminal:
time = 0.043381452560424805, answer = 499936.40420325665.
time = 0.1324300765991211, answer = 499936.40420325927.
time = 0.4444568157196045, answer = 5000677.883536603.
time = 0.8388040065765381, answer = 5000677.883536343.
I also put an if __name__ == '__main__': line after fun2 and before the rest, and I get the same results in the terminal. I also tried Python 3.6.2 on a Codio server and got similar timings:
time = 0.048882484436035156, answer = 499937.07655266096.
time = 0.15220355987548828, answer = 499937.0765526707.
time = 0.4848289489746094, answer = 4999759.127770024.
time = 1.4035391807556152, answer = 4999759.127769606.
I guess something is wrong with my code, a misunderstanding of how to use multiprocessing.Pool rather than a problem with Python, but I can't think of what. Any help will be appreciated. I expected that using two workers would give a speedup by a factor of two, not a slowdown. Also, in case it is needed, I checked with multiprocessing.cpu_count(): the Codio server has 4 CPUs and my computer has 12.
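One way to see where the extra time might go: Pool.map pickles each argument, sends it to a worker process, and the worker unpickles it before fun1 ever runs. The sketch below is only a rough check, not a full profile. It times a pickle round trip of half the list against simply summing it. The helper names time_call and roundtrip are made up for illustration, and the built-in sum stands in for fun1.

```python
import pickle
import random
import time


def time_call(func, *args):
    """Return (result, elapsed seconds) for a single call of func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def roundtrip(data):
    """Serialize and deserialize data, roughly what each map argument goes through."""
    return pickle.loads(pickle.dumps(data))


if __name__ == "__main__":
    x = [random.random() for _ in range(10**6)]
    half = x[: len(x) // 2]

    # Cost of the actual work on one half (built-in sum, assumed stand-in for fun1).
    _, t_sum = time_call(sum, half)
    # Cost of just moving that half through pickle, with no useful work done.
    _, t_pickle = time_call(roundtrip, half)

    print(f"summing half the list: {t_sum:.4f} s")
    print(f"pickle round trip:     {t_pickle:.4f} s")
```

If the round trip costs about as much as the summation or more, splitting this particular job across processes cannot pay off, since the pool also has to start its worker processes on top of that.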

Try adding time.sleep(0.001) (perhaps with a smaller input size) after each iteration in fun1: suddenly you'll see the pool finishing faster.

fun1: gist.githubusercontent.com/IISResetMe/…

sleep() isn't CPU-bound, so using multiple processors in this case won't give you anything other than more scheduling overhead :) For a CPU-bound workload, do something that requires many CPU instructions in quick succession: a simple (=unoptimized) primality test over large integers, multiplying big matrices, long division to approximate the first 10000 digits of phi, etc.
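A minimal sketch of the CPU-bound workload suggested in the comment above, assuming a naive trial-division primality test. The names is_prime, count_primes, split_range and count_parallel are hypothetical, not from the question. Unlike fun2, each worker receives only a pair of integers, so very little data has to be pickled, and the work per item is heavy enough that two workers have a chance to beat the serial loop.

```python
import multiprocessing
import time


def is_prime(n):
    """Naive (unoptimized) trial-division primality test."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def count_primes(bounds):
    """Count the primes in the half-open range [lo, hi)."""
    lo, hi = bounds
    return sum(1 for n in range(lo, hi) if is_prime(n))


def split_range(lo, hi, parts):
    """Split [lo, hi) into `parts` contiguous half-open sub-ranges."""
    edges = [lo + (hi - lo) * k // parts for k in range(parts + 1)]
    return list(zip(edges[:-1], edges[1:]))


def count_parallel(lo, hi, workers=2):
    """Count primes in [lo, hi) using a pool; only small tuples are sent to workers."""
    with multiprocessing.Pool(workers) as pool:
        return sum(pool.map(count_primes, split_range(lo, hi, workers)))


if __name__ == "__main__":
    lo, hi = 10**8, 10**8 + 20_000

    st = time.perf_counter()
    serial = count_primes((lo, hi))
    print(f"serial:   time = {time.perf_counter() - st:.3f}, answer = {serial}")

    st = time.perf_counter()
    parallel = count_parallel(lo, hi, workers=2)
    print(f"parallel: time = {time.perf_counter() - st:.3f}, answer = {parallel}")
```

The actual speedup depends on the machine and on how evenly the primes fall across the two sub-ranges, so a factor of exactly two should not be expected; the pool start-up cost is still paid on every call to count_parallel.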