I have a (very) large multi-indexed dataframe with a single boolean column. For example:
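import numpy as np
import pandas as pd

# Three inner dataframes of 10 rows each, stacked under the outer keys
bool_arr = np.random.randn(30) < 0
df = pd.concat(3 * [pd.DataFrame(np.random.randn(10, 3), columns=['A', 'B', 'C'])],
               keys=np.array(['one', 'two', 'three']))
df['bool'] = bool_arr
df.index.rename(['Ind1', 'Ind2'], inplace=True)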
I'm trying to set the boolean column to False on the first two and last two rows of each inner dataframe, but only if the third row (or, respectively, the third-to-last row) isn't True. In other words, whenever the third entry is False, I want the first three boolean entries to all be False, and likewise for the last three.
I can do this by iterating over the outer index level, extracting the inner dataframes one by one, resetting the relevant values, and then plugging the new Series back into a copy of the original dataframe. But this is very wasteful in both time and memory.
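For reference, here is a minimal sketch of that loop-based approach. The function name `reset_edges_loop` and its parameters are made up for illustration, and it assumes the rule reads as: if the third entry is False, clear the first two; if the third-to-last entry is False, clear the last two. Inner dataframes with fewer than three rows are left untouched, which is also an assumption.

```python
import pandas as pd


def reset_edges_loop(df, col="bool", level="Ind1"):
    """Loop over the outer index level and fix one inner dataframe at a time.

    Hypothetical helper: name, signature, and the handling of short
    groups are illustrative assumptions.
    """
    out = df.copy()  # full copy of the original frame (the memory cost)
    outer = out.index.get_level_values(level)
    for key in outer.unique():
        mask = outer == key  # boolean mask selecting one inner dataframe
        vals = out.loc[mask, col].to_numpy().copy()
        if len(vals) >= 3:
            # Evaluate both conditions before modifying anything,
            # so the two rules cannot influence each other.
            clear_head = not vals[2]
            clear_tail = not vals[-3]
            if clear_head:
                vals[:2] = False
            if clear_tail:
                vals[-2:] = False
        out.loc[mask, col] = vals
    return out
```

Calling `reset_edges_loop(df)` on the example above returns a modified copy and leaves `df` unchanged. It makes one pass per outer key, which is where the time goes when there are many inner dataframes.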
Is there a faster way of doing this?
(I should add that in my example all inner dataframes are of the same length, but that's not necessarily the case for me)