Suppose we have a data frame:
In [1]: import numpy as np
   ...: import pandas as pd
In [2]: df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
In [3]: df
Out[3]:
A B C D
0 45 88 44 92
1 62 34 2 86
2 85 65 11 31
3 74 43 42 56
4 90 38 34 93
5 0 94 45 10
.. .. .. .. ..
How can I randomly replace x% of all entries with a value, such as None?
In [4]: something(df, percent=25)
Out[4]:
A B C D
0 45 88 None 92
1 62 34 2 86
2 None None 11 31
3 74 43 None 56
4 90 38 34 None
5 None 94 45 10
.. .. .. .. ..
I've found information about sampling particular axes, and I can imagine a way of randomly generating integers within the dimensions of my data frame and setting those equal to None, but that doesn't feel very Pythonic.
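For concreteness, here is a minimal sketch of the index-based approach I have in mind. `replace_random` and its `value` and `seed` parameters are hypothetical names for illustration, not part of pandas. It assumes the data can be cast to object dtype so that None is stored as-is, and it rounds the requested percentage to a whole number of cells.

```python
import numpy as np
import pandas as pd


def replace_random(df, percent, value=None, seed=None):
    """Hypothetical helper: return a copy of df with `percent`% of cells set to `value`."""
    rng = np.random.default_rng(seed)
    n_cells = df.size
    n_replace = int(round(n_cells * percent / 100))

    # Pick distinct flat positions, then convert them to (row, column) pairs.
    flat = rng.choice(n_cells, size=n_replace, replace=False)
    rows, cols = np.unravel_index(flat, df.shape)

    # Work on an object-dtype copy so the original frame is left untouched
    # and None is not coerced to NaN.
    arr = df.to_numpy(dtype=object, copy=True)
    arr[rows, cols] = value
    return pd.DataFrame(arr, index=df.index, columns=df.columns)
```

Calling `replace_random(df, percent=25)` on the frame above would blank 100 of its 400 cells. It works, but it is the explicit index juggling I was hoping to avoid.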