I have a DataFrame df with two columns, gender and score.

|---------------------|------------------|
|      gender         |     score        |
|---------------------|------------------|
|          male       |         34       |
|---------------------|------------------|
|          female     |         34       |
|---------------------|------------------|
|          male       |         34       |
|---------------------|------------------|
|          female     |         34       |
|---------------------|------------------|
|          male       |         34       |
|---------------------|------------------|

I want to set the scores of males (gender == 'male') in rows 3 through 5 to 0. Expected output:

|---------------------|------------------|
|      gender         |     score        |
|---------------------|------------------|
|          male       |         34       |
|---------------------|------------------|
|          female     |         34       |
|---------------------|------------------|
|          male       |         0        |
|---------------------|------------------|
|          female     |         34       |
|---------------------|------------------|
|          male       |         0        |
|---------------------|------------------|

How can I combine iloc with that condition?

  • Your question is unclear. Please include an example of the input and the expected output. Commented Jul 8, 2018 at 20:56
  • You should use loc not iloc here. df.loc[df.gender == 'male', 'score'] = 0 Commented Jul 8, 2018 at 21:04
  • @AntonvBR I don't want to change the first row while the gender is male. Commented Jul 8, 2018 at 21:06
  • @Harold My bad. Commented Jul 8, 2018 at 21:20
  • Is "3 to 5" a hardwired condition? Or do you believe that you have a case of replicated data? Commented Jul 8, 2018 at 21:21

2 Answers

Alt1:

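You could do it with two masks (conditions): one that selects the males, and one that selects rows whose gender value has already appeared earlier, which excludes the first male. This is the more readable option.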

m1 = (df.gender == 'male')
m2 = (df.gender.duplicated())

df.loc[m1&m2, 'score'] = 0
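
To see why the two masks combine this way, it can help to print them for the sample frame. This is a minimal sketch, assuming the five-row frame from the question; the mask names m1 and m2 are the ones used above.

```python
import pandas as pd

# Sample frame from the question (the scalar 34 is broadcast to every row)
df = pd.DataFrame({
    'gender': ['male', 'female', 'male', 'female', 'male'],
    'score': 34,
})

m1 = df.gender == 'male'      # True for every male row
m2 = df.gender.duplicated()   # True for every repeat of an already-seen value

print(m1.tolist())            # [True, False, True, False, True]
print(m2.tolist())            # [False, False, True, True, True]
print((m1 & m2).tolist())     # [False, False, True, False, True]
```

Note that the combined mask means "every male except the first one", not "males within a fixed row range". The two happen to coincide for this sample data.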

Alt2:

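Take the positions where the mask is True and drop the first one (requires import numpy as np). This should be faster.

# Positions of the male rows, minus the first; to_numpy() avoids calling np.nonzero on a Series
m = np.nonzero((df.gender == 'male').to_numpy())[0][1:]
# Map positions back to index labels so this also works with a non-default index
df.loc[df.index[m], 'score'] = 0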

Full example:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'gender': ['male','female','male','female','male'],
    'score': 34
})

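# Alt1: two boolean masks (assignment left commented out so only Alt2 runs)
m1 = (df.gender == 'male')
m2 = (df.gender.duplicated())
# df.loc[m1 & m2, 'score'] = 0

# Alt2: positions of the male rows, minus the first
m = np.nonzero((df.gender == 'male').to_numpy())[0][1:]
df.loc[df.index[m], 'score'] = 0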

print(df)

Returns:

   gender  score
0    male     34
1  female     34
2    male      0
3  female     34
4    male      0

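I think you need:

# Select the rows labelled 2 through 5 (label slices are inclusive), then keep only the males
m = df.loc[2:5, :].loc[df['gender'] == 'male']
# Zero the score for the rows that survived both filters
df.loc[m.index, 'score'] = 0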
print(df)
    gender  score
0   male    34
1   female  34
2   male    0
3   female  34
4   male    0
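
If the row range is meant positionally (rows 3 to 5, counting from 1) rather than by index label, one option is to build a positional mask and combine it with the condition. This is a sketch, assuming the sample frame from the question; start, stop, pos, in_range, and is_male are illustrative names that do not appear in the original post.

```python
import numpy as np
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    'gender': ['male', 'female', 'male', 'female', 'male'],
    'score': 34,
})

start, stop = 2, 5                        # zero-based positions, stop exclusive: rows 3 to 5
pos = np.arange(len(df))                  # position of each row, independent of the index labels
in_range = (pos >= start) & (pos < stop)  # the positional part, in the role iloc would play
is_male = (df['gender'] == 'male').to_numpy()

# A boolean array passed to loc selects rows by position, so both conditions apply at once
df.loc[in_range & is_male, 'score'] = 0
print(df)
```

Because the mask is built from positions rather than labels, this should behave the same if the frame has a non-default index.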
