I have to loop N times to calculate formulas and add the results to a dataframe. My code works and takes a few seconds to process each item. However, it can only process one item at a time because I'm running the array through a for loop.

I tried to update the code by adding the numba library to optimise it:

def calculationResults(myconfig,df_results,isvalid,dimension,....othersparams):
    for month in nb.prange(0, myconfig.len_production):
        calculationbymonth(month,df_results,....othersparams)
    return df_results

But it's still processing one item at a time. Any ideas?

    We need more code to be able to understand what you're doing. I don't see any threading or multiprocessing being applied here. Commented May 25, 2022 at 7:37

1 Answer

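We can parallelize the apply with a function similar to the one below. It splits the dataframe into chunks, runs func on each chunk in a separate process, and concatenates the results.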

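import numpy as np
import pandas as pd
from multiprocessing import Pool

def parallelize_dataframe(df, func, n_cores=4):
    # Split the dataframe into n_cores chunks, one per worker process
    df_split = np.array_split(df, n_cores)
    pool = Pool(n_cores)
    # func must be a top-level (picklable) function that takes and returns a dataframe
    df = pd.concat(pool.map(func, df_split))
    pool.close()
    pool.join()
    return df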
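
For a loop over months like the one in the question, the same idea could be applied to the loop itself rather than to a dataframe split. The following is only a minimal sketch under stated assumptions: `calculation_by_month`, `base` and `rate` are hypothetical stand-ins for the real per-month function and its extra parameters, which are not shown in the question. Worker processes do not share memory with the parent, so each call returns its row instead of writing into a shared `df_results`, and `functools.partial` binds the arguments that are the same for every month.

```python
from functools import partial
from multiprocessing import Pool


def calculation_by_month(month, base, rate):
    # Hypothetical stand-in for the real per-month formula.
    # It returns a row instead of mutating a shared dataframe.
    return {"month": month, "value": base * (1 + rate) ** month}


def calculation_results(n_months, base, rate, n_workers=4):
    # Bind the arguments that do not change between months, so that
    # pool.map only has to supply the month index.
    worker = partial(calculation_by_month, base=base, rate=rate)
    with Pool(n_workers) as pool:
        rows = pool.map(worker, range(n_months))
    # pool.map preserves input order, so rows are ordered by month.
    return rows


if __name__ == "__main__":
    rows = calculation_results(n_months=12, base=100.0, rate=0.01)
    print(rows[:2])
```

In this sketch the Pool lives in the outer function (the one that loops), and the per-month function stays an ordinary function. The returned list of dicts can then be turned into a dataframe in the parent process, for example with `pd.DataFrame(rows)`.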

1 Comment

I don't understand your answer. I have one function that iterates over the N items and a second function that calculates each item. If I want to improve performance with a multiprocessing Pool so that many items are processed at the same time, and my function takes many arguments:

def calculationResults(myconfig,df_results,isvalid,dimension,....othersparams):
    for month in nb.prange(0, myconfig.len_production):
        calculationbymonth(month,df_results,....othersparams)
    return df_results

in which function do I need to add the multiprocessing Pool: the first one or the second one?
