Consider the following:
fine = np.random.uniform(0,100,10)
fine[fine<20] = 0 # introduce some intermittency
coarse = np.sum(fine.reshape(-1,2),axis=1)
fine is a timeseries of magnitudes (e.g. volume of rainfall). coarse is the same timeseries but at a halved resolution, so every 2 timesteps in fine are aggregated to a single value in coarse.
I am then interested in the weighting that determines the proportions of the magnitude of coarse that corresponds to each timestep in fine for the instances where the value of coarse is above zero.
def w_xx(fine, coarse):
weights = []
for i, val in enumerate(coarse):
if val > 0:
w = fine[i*2:i*2+2]/val # returns both w1 and w2, w1 is 1st element, w2 = 1-w1 is second
weights.append(w)
return np.asarray(weights)
So w_xx(fine,coarse) would return an array of shape 5,2 where the elements of axis=1 are the weights of fine for a value of coarse.
This is all fine for smaller timeseries, but I'm running this analysis on ~60k-sized arrays of fine, plus in a loop of 300+ iterations.
I have been trying to make this run in parallel using the multiprocessing library in Python2.7 but I've not managed to get far. I need to be be reading both timeseries at the same time in order to get the corresponding values of fine for every value in coarse, plus to only work for values above 0, which is what my analysis requires.
I would appreciate suggestions on a better way to do this. I imagine if I can define a mapping function to use with Pool.map in multiprocessing, I should be able to parallelize this? I've only just started out with multiprocessing so I don't know if there is another way?
Thank you.