There are unreasonably high values as well as negative values in the 'Net Entries' and 'Net Exits' columns. I am trying to fix them with the code below, but I keep encountering the warnings shown after it. Here is my code:
indexes = [*D.index.unique()]
list_ = []
for index in indexes:
    df = D[D.index == index]
    array_ent = np.array(df['Net Entries'])
    array_ext = np.array(df['Net Exits'])
    avg_ent = np.mean(array_ent[(array_ent > 0) & (array_ent < 5040)])
    avg_ext = np.mean(array_ext[(array_ext > 0) & (array_ext < 5040)])
    array_ent[(array_ent < 0) | (array_ent > 5040)] = avg_ent
    array_ext[(array_ext < 0) | (array_ext > 5040)] = avg_ext
    df['x'] = array_ent
    df['y'] = array_ext
    list_.append(df)
MTA = pd.concat(list_, axis=0)

RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)

Can anyone help me solve this problem?

… array_ent or array_ext fulfill your conditions.

… groupby? A typical example of transforming a dataframe with groupby: pandas.pydata.org/pandas-docs/stable/user_guide/…. Replacing with the per-group mean can be done by using where (pandas.pydata.org/docs/reference/api/…), which keeps the values where the condition is true and replaces the rest: lambda x: x.where((x >= 0) & (x <= 5040), x.mean()). One-liners with clip in answers here: stackoverflow.com/q/47187359.
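
The groupby and where suggestion above could be sketched roughly as follows. This is only an illustration under stated assumptions: the sample frame D is made up, the helper name replace_outliers is hypothetical, and the bounds 0 and 5040 are taken from the question. Unlike the lambda in the comment, the fill value here is the mean of the in-range values only, which appears to be what the question's code intends. Values exactly equal to 0 or 5040 are treated as valid, which differs slightly from the strict inequalities used for the mean in the question.

```python
import pandas as pd

LOW, HIGH = 0, 5040  # bounds taken from the question


def replace_outliers(s: pd.Series) -> pd.Series:
    """Replace out-of-range values with the mean of the in-range values (hypothetical helper)."""
    valid = s.between(LOW, HIGH)  # inclusive on both ends
    # Series.mean() of an empty selection is NaN, so a group with no valid
    # values ends up as NaN instead of being filled with a meaningless number.
    fill = s[valid].mean()
    # where() keeps values where the condition is True and replaces the rest.
    return s.where(valid, fill)


# Made-up sample data; the index plays the role of the grouping key in the question.
D = pd.DataFrame(
    {
        "Net Entries": [100, -5, 300, 9000, 7000, -1],
        "Net Exits": [50, 60, 99999, 10, 20, 30],
    },
    index=["A", "A", "A", "B", "B", "C"],
)

MTA = D.copy()
for col in ["Net Entries", "Net Exits"]:
    # groupby(level=0) groups rows by index value, like the loop in the question.
    MTA[col] = D.groupby(level=0)[col].transform(replace_outliers)

print(MTA)
```

Groups in which no value passes the range check (B and C for 'Net Entries' in this made-up data) come out as NaN. Those are the groups for which np.mean in the original loop receives an empty array, and they can be handled afterwards, for example with fillna.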