I have a data frame in which the variables in the first n columns are identical across groups of rows (for instance, every 2 rows), and I would like to aggregate the remaining columns, which are numeric, over each group. Here is an example:
import pandas as pd
import numpy as np

data = [
    [1, 2, np.nan, 'string', 100, 200],
    [1, 2, np.nan, 'string', 102, 202],
    [1, 2, 5, 0.5, 1000, 2000],
    [1, 2, 5, 0.5, 1002, 2002],
]
pd.DataFrame(data=data, columns=['Var1', 'Var2', 'Var3', 'Var4', 'Var5', 'Var6'])
   Var1  Var2  Var3    Var4  Var5  Var6
0     1     2   NaN  string   100   200
1     1     2   NaN  string   102   202
2     1     2   5.0     0.5  1000  2000
3     1     2   5.0     0.5  1002  2002
In this data frame, I would like to compute the average of Var5 and Var6 over each consecutive pair of rows, keeping the values of Var1 to Var4. The intended output would be the following:
   Var1  Var2  Var3    Var4  Var5  Var6
0     1     2   NaN  string   101   201
1     1     2   5.0     0.5  1001  2001
Is there a way to do this given that the values within a single column do not have a consistent type? For instance, Var3 contains both NaN and floats, and Var4 contains both strings and floats.
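One approach that might work is sketched below. It assumes that rows to be combined always come in consecutive blocks of a fixed size (2 here), so the grouping can be done on row position rather than on the mixed-type columns themselves. The names df, key_cols, value_cols, block_size, block and result are only illustrative. Grouping on position sidesteps the NaN and mixed-type values in Var3 and Var4, since those columns are never used as group keys.

```python
import numpy as np
import pandas as pd

data = [
    [1, 2, np.nan, 'string', 100, 200],
    [1, 2, np.nan, 'string', 102, 202],
    [1, 2, 5, 0.5, 1000, 2000],
    [1, 2, 5, 0.5, 1002, 2002],
]
df = pd.DataFrame(data, columns=['Var1', 'Var2', 'Var3', 'Var4', 'Var5', 'Var6'])

# Assumption: the first four columns are constant within each block,
# and the last two are the numeric columns to average.
key_cols = ['Var1', 'Var2', 'Var3', 'Var4']
value_cols = ['Var5', 'Var6']
block_size = 2  # assumption: every group is exactly 2 consecutive rows

# Label rows 0,1 as block 0, rows 2,3 as block 1, and so on.
block = np.arange(len(df)) // block_size

# Take the first value of each key column and the mean of each value column.
# Note that 'first' skips NaN, so an all-NaN block stays NaN.
agg_spec = {**{c: 'first' for c in key_cols},
            **{c: 'mean' for c in value_cols}}
result = df.groupby(block).agg(agg_spec)
print(result)
```

Note that the averaged columns come back as floats (101.0 rather than 101). If the blocks are not of fixed size, this positional grouping would not apply and the keys would have to be derived from the columns themselves.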