Suppose I have a Pandas Dataframe in the format below based on a 'Dict of Dicts 'Group'list (also below)…
ITEMS={
“Item_group1”:{‘Stuff’:’Some stuf’
‘More Stuff’:’Extra Stuff’
Group:[[Iteration1, 18, 25,0], [Iteration1, 43, 67,1], [Iteration1, 87, 76,1],
[Iteration2, 45, 29,0], [Iteration2, 44, 77,1], [Iteration2, 43, 74,0]],
}
“Item_group2”:{‘Stuff’:’Some stuf’
‘More Stuff’:’Extra Stuff’
Group:[[Iteration1, 75, 564,0], [Iteration1, 21, 87,1], [Iteration1, 7, 5,1],
[Iteration2, 54, 24,0], [Iteration2, 7, 45,1], [Iteration2, 45, 745,0]],
}
DataFrame in the following format….
Iteration Value1 Value2 Feature Active
Iteration1 18 25 0
Iteration1 3 67 1
Iteration1 87 76 1
Iteration2 45 29 0
Iteration2 44 7 1
Iteration2 43 74 0
How would I go about separating and calculating the averages for each iteration based on ‘Feature Active’ == 1, and ignore any ‘Feature Active’ ==0 entries?
I have the following code to calculate the stats per iteration for Value1 and Value2 after separating for ‘Iteration’ and ‘Feature Active’ as the keys, but it displays ‘Feature Active’ ==0 which I am not concerned with.
FeatureAvgs = Item_group1_DF.groupby(['Iteration’,’Feature Active'])
print np.round(FeatureAvgs[['Value1','Value2']].describe(), decimals=1)
Which produces the following output… (ignore the actual figures, this was taken from another dataframe)
Iteration Feature Enabled
Iteration1 0 count 3672.0 3672.0
mean -1352.5 0.0
std 220.5 0.0
min -1920.0 0.0
25% -1507.2 0.0
50% -1267.0 0.0
75% -1184.0 0.0
max -785.0 0.0
1 count 580.0 580.0
mean -1368.6 -1394.5
std 151.5 157.7
min -1788.0 -1805.0
25% -1454.2 -1490.2
50% -1335.5 -1361.0
75% -1270.0 -1291.0
max -1045.0 -1033.0
Iteration2 0 count 20612.0 20612.0
mean -1073.5 0.0
std 142.3 0.0
min -1730.0 0.0
25% -1088.0 0.0
50% -1036.0 0.0
75% -1005.0 0.0
max -805.0 0.0
1 count 14718.0 14718.0
mean -1113.6 -1161.1
std 129.3 134.9
min -1773.0 -1818.0
25% -1151.0 -1214.0
50% -1095.0 -1122.0
75% -1043.0 -1075.0
max -832.0 -897.0
But I am just after the mean averages for when the feature is active (==1). Sorry for the long question, but I am new to Pandas, and still getting through the documentation
FeatureAvgs = Item_group1_DF[item_group1_DF['Feature Enabled'] == 1].groupby(['Iteration’,’Feature Active'])[['Value1','Value2']].mean()