I have a large pandas DataFrame df with roughly a million rows and 100 columns, and I need to create a second DataFrame df_n of the same size. Several rows and columns of df_n will be identical to the corresponding rows and columns of df. I have a boolean mask m and a list of columns l marking where df_n differs from df, and I also have a DataFrame df_small holding the differing values, such that df_n[m][l] = df_small.
I would then like to set df_n[~m][~l] = df[~m][~l]. To save memory, I want to avoid creating any intermediate copies of df. It is probably a trivial problem, but I am struggling to achieve it. The final result should be that df_n references df for [~m][~l], and the only new memory used is for df_small. How can this be done?
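To make the setup concrete, here is a minimal sketch with toy sizes. The column names, the mask values and the contents of df_small are made up for illustration. It shows only the obvious construction, which copies all of df and is exactly the allocation I would like to avoid:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real data (the real df is ~1e6 rows x 100 columns).
df = pd.DataFrame(
    np.arange(24, dtype=float).reshape(6, 4),
    columns=["a", "b", "c", "d"],  # hypothetical column names
)

# Boolean row mask and list of columns where df_n differs from df
# (values chosen arbitrarily for the example).
m = pd.Series([True, False, True, False, False, True], index=df.index)
l = ["b", "d"]

# The differing values, indexed by the masked rows and the listed columns.
df_small = pd.DataFrame(-1.0, index=df.index[m], columns=l)

# Obvious approach: full copy of df, then overwrite the differing block.
# .loc[m, l] assigns in one step and avoids the chained df_n[m][l] form.
df_n = df.copy()
df_n.loc[m, l] = df_small
```

This gives the right values, but df_n occupies as much memory as df, not just the memory needed for df_small.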
"df_n references df for [~m][~l], and new memory is occupied by only df_small" Do you just want df_n to take on the values of df for all [~m][~l]? I do not know of a way to do this without allocating memory to df_n, so is that okay?