I have a pandas DataFrame column of uniformly distributed data with a few NaN values I'd like to replace.

Since the data is uniformly distributed, I decided to fill the null values with random uniform samples drawn from the range between the column's min and max values. I used the following code to get the random uniform sample:

df_copy['ep'] = df_copy['ep'].fillna(value=np.random.uniform(3, 331))

Of course, pd.DataFrame.fillna() called this way replaces all existing NaNs with the same value, because np.random.uniform(3, 331) returns a single number. I would like each NaN to get a different value. I assume a for loop could do the job, but I am unsure how to write one that handles only these NaN values. Thanks for the help!

  • Please look up df.where Commented Apr 12, 2019 at 16:11

2 Answers


It looks like you are doing this on a Series (a single column), but the same approach would work on a DataFrame:

Sample Data:

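import numpy as np
import pandas as pd

# float dtype so the Series can hold NaN without upcasting
series = pd.Series(range(100), dtype=float)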

series.loc[2] = np.nan
series.loc[10:15] = np.nan

Solution:

# mask() returns a new Series; each NaN position takes its own random draw
series = series.mask(series.isnull(), np.random.uniform(3, 331, size=series.shape))
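As a self-contained sketch of the mask approach, the snippet below rebuilds the sample data and checks that every NaN receives its own draw. The seeded `np.random.default_rng` generator and the names `candidates` and `filled` are assumptions added for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (an assumption, not required)
rng = np.random.default_rng(0)

# Sample data: float Series with NaN at label 2 and labels 10 through 15
series = pd.Series(range(100), dtype="float64")
series.loc[2] = np.nan
series.loc[10:15] = np.nan

n_missing = int(series.isna().sum())

# One candidate per row; mask() only uses the candidates at NaN positions
candidates = rng.uniform(3, 331, size=series.shape)
filled = series.mask(series.isna(), candidates)

print(filled[series.isna()])
```

This draws more random numbers than strictly needed (one per row rather than one per NaN), which keeps the call simple at the cost of some wasted samples.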

Use boolean indexing with DataFrame.loc:

m = df_copy['ep'].isna()

df_copy.loc[m, 'ep'] = np.random.uniform(3, 331, size=m.sum())
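A minimal sketch of the same boolean-indexing idea, deriving the bounds from the column itself instead of hardcoding 3 and 331, as the question describes. The small `df_copy` frame here is hypothetical stand-in data, and the seeded generator is an assumption for reproducibility.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (an assumption, not required)
rng = np.random.default_rng(42)

# Hypothetical stand-in for the asker's df_copy
df_copy = pd.DataFrame({"ep": [3.0, np.nan, 120.5, np.nan, 331.0, np.nan]})

# min() and max() skip NaN by default, so the bounds come from observed values
low, high = df_copy["ep"].min(), df_copy["ep"].max()

# Boolean mask of the rows to fill
m = df_copy["ep"].isna()

# Draw exactly one sample per missing row and assign them in place
df_copy.loc[m, "ep"] = rng.uniform(low, high, size=int(m.sum()))

print(df_copy)
```

Because `size` equals the number of NaNs, no samples are wasted, and the existing values are left untouched.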

1 Comment

Thank you for the help!
