Use random value in range to change value in dataset

Question

I have a dataset that is full of NaN and outlier values.

I have managed to locate these values and replace them with a random number from a certain range using:

dataset.loc[(dataset['MaxHR'] == 1000) & (dataset['Age'] < 50), 'MaxHR'] = random.randrange(138, 176)

My problem is that I wanted random.randrange(138, 176) to pick a new number each time the value 1000 occurs.

Instead, my code picks one number and assigns that same number to every row that meets the conditions.
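
The single value appears because random.randrange(138, 176) is evaluated once, before the assignment, and the resulting integer is then broadcast to every matching row. A minimal sketch of one way around this, assuming NumPy is available and using made-up sample data (only the column names come from the question): draw as many values as there are matching rows and assign them as an array.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
dataset = pd.DataFrame({
    "Age":   [40, 45, 60, 30, 48],
    "MaxHR": [1000, 150, 1000, 1000, 1000],
})

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable

mask = (dataset["MaxHR"] == 1000) & (dataset["Age"] < 50)

# One draw per matching row; the upper bound is exclusive, like random.randrange.
dataset.loc[mask, "MaxHR"] = rng.integers(138, 176, size=mask.sum())
```

Rows that do not match the mask, such as the 1000 belonging to the 60-year-old, are left untouched.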

Perhaps it would make sense to generate a random number for each row in your dataframe and then swap out the matching rows (1000 & age<50) with the generated random number in that same row.
– JNevill, Commented Dec 8, 2021 at 19:11
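
A rough sketch of what this comment describes, assuming NumPy and a small invented dataset (the name candidates is hypothetical): build one random value per row, then take it only where the condition holds.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
dataset = pd.DataFrame({
    "Age":   [40, 45, 60, 30],
    "MaxHR": [1000, 150, 1000, 1000],
})

rng = np.random.default_rng(1)  # seeded only to make the sketch repeatable

# One candidate value for every row, aligned on the dataframe's index.
candidates = pd.Series(rng.integers(138, 176, size=len(dataset)),
                       index=dataset.index)

condition = (dataset["MaxHR"] == 1000) & (dataset["Age"] < 50)

# Series.mask keeps the original value where the condition is False
# and takes the candidate from the same row where it is True.
dataset["MaxHR"] = dataset["MaxHR"].mask(condition, candidates)
```

This generates more random numbers than are strictly needed, but it keeps every step aligned on the index.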

Basbeu · Accepted Answer · 2021-12-08 19:31:09Z

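Consider using the applymap method. It calls a function on every cell of the dataframe, so random.randrange runs again for each value that gets replaced. Here is a simple example: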

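import random
import pandas as pd

df = pd.DataFrame([[0, 1], [1, 2], [1, 3]], columns=['a', 'b'])

def clean(x):
    # Called once per cell, so each match gets its own random draw.
    if x == 1:
        return random.randrange(138, 176)
    return x

# applymap applies clean to every cell (deprecated in pandas 2.1 in favor of DataFrame.map).
df = df.applymap(clean)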
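
Because applymap sees one cell at a time, it cannot check Age while deciding what to do with MaxHR. A sketch of the same per-call idea applied row by row, assuming invented sample data and a hypothetical helper named clean_row:

```python
import random
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
dataset = pd.DataFrame({
    "Age":   [40, 45, 60],
    "MaxHR": [1000, 150, 1000],
})

def clean_row(row):
    # Called once per row, so randrange runs again for every match.
    if row["MaxHR"] == 1000 and row["Age"] < 50:
        return random.randrange(138, 176)
    return row["MaxHR"]

# axis=1 passes each row to clean_row as a Series.
dataset["MaxHR"] = dataset.apply(clean_row, axis=1)
```

Row-wise apply is easy to read but tends to be slow on large dataframes, so it is probably best suited to small datasets.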

edited Dec 8, 2021 at 19:31

answered Dec 8, 2021 at 19:21
