
I have a dataset that is full of NaN and outlier values.

I have managed to locate these values and replace them with a random number from a certain range using:

dataset.loc[(dataset['MaxHR'] == 1000) & (dataset['Age'] < 50), 'MaxHR'] = random.randrange(138, 176)

My problem is that I want random.randrange(138, 176) to pick a new number each time the value 1000 occurs.

Instead, my code picks a single number and assigns that same number to every row in the column that meets the conditions.

  • What do you want to happen instead? Commented Dec 8, 2021 at 19:06
  • Perhaps it would make sense to generate a random number for each row in your dataframe and then swap out the matching rows (1000 & age<50) with the generated random number in that same row. Commented Dec 8, 2021 at 19:11
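
The idea in the comment above could be sketched roughly as follows. This is a minimal illustration, not the commenter's own code: the sample rows and the names `candidates` and `bad` are made up for the example, and NumPy's generator is assumed as the source of random numbers in place of the `random` module.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 1000 marks a bad MaxHR reading.
dataset = pd.DataFrame({
    "Age": [45, 30, 62, 48],
    "MaxHR": [1000, 150, 1000, 1000],
})

rng = np.random.default_rng(seed=0)  # seed is only here for reproducibility

# One candidate replacement per row, drawn from 138..175
# (upper bound excluded, like random.randrange(138, 176)).
candidates = pd.Series(
    rng.integers(138, 176, size=len(dataset)),
    index=dataset.index,
)

# Rows to fix: placeholder value and Age under 50.
bad = (dataset["MaxHR"] == 1000) & (dataset["Age"] < 50)

# Series.mask swaps in the candidate from the same row wherever `bad` is True.
dataset["MaxHR"] = dataset["MaxHR"].mask(bad, candidates)
```

Rows that do not match the condition (here the 62-year-old with 1000 and the valid reading of 150) keep their original values.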

1 Answer


You can consider using the applymap method, which calls a function once per element, so every matching value gets its own random number. Here is a simple example:

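import random
import pandas as pd

df = pd.DataFrame([[0, 1], [1, 2], [1, 3]], columns=['a', 'b'])

def clean(x):
    # Draw a fresh random number for each matching cell; leave other values unchanged.
    if x == 1:
        return random.randrange(138, 176)
    return x

# applymap calls clean once per element
# (deprecated in favor of DataFrame.map since pandas 2.1).
df = df.applymap(clean)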
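
For the MaxHR column from the question, a vectorized alternative might look like the sketch below. The original line calls random.randrange once, before the assignment happens, so pandas broadcasts that single scalar to every matching row. Passing an array with one value per matching row avoids this. The sample data and the name `mask` are assumptions for illustration, and NumPy's `Generator.integers` stands in for `random.randrange`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 1000 marks a bad MaxHR reading.
dataset = pd.DataFrame({
    "Age": [45, 30, 62, 48, 22],
    "MaxHR": [1000, 150, 1000, 1000, 1000],
})

rng = np.random.default_rng()

# Boolean mask of the rows that need a replacement.
mask = (dataset["MaxHR"] == 1000) & (dataset["Age"] < 50)

# Draw exactly as many random integers as there are matching rows
# (138 inclusive, 176 exclusive) and assign them row by row.
dataset.loc[mask, "MaxHR"] = rng.integers(138, 176, size=mask.sum())
```

Because the right-hand side has the same length as the selection, each matching row receives its own draw, while rows outside the mask are left untouched.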