Pandas return value from multiple columns if equal to value in another column

Question

I have a Pandas dataframe like this:

  A      B        C         D
0 month   month+1 quarter+1 season+1
1 season  month+5 quarter+3 season+2
2 day     month+1 quarter+2 season+1
3 year    month+3 quarter+4 season+2
4 quarter month+2 quarter+1 season+1
5 month   month+4 quarter+1 season+2

I would like to insert a new column called 'E' based on several conditions: if column 'A' equals 'month', return the value in 'B'; if 'A' equals 'quarter', return the value in 'C'; if 'A' equals 'season', return the value in 'D'; otherwise, return the value in column 'A'. The expected result:

  A      B        C         D        E
0 month   month+1 quarter+1 season+1 month+1
1 season  month+5 quarter+3 season+2 season+2
2 day     month+1 quarter+2 season+1 day
3 year    month+3 quarter+4 season+2 year
4 quarter month+2 quarter+1 season+1 quarter+1
5 month   month+4 quarter+1 season+2 month+4

I am having trouble doing this. I have tried playing around with a function but it did not work. See my attempt:

def f(row):
    if row['A'] == 'month':
        val = ['B']
    elif row['A'] == 'quarter':
        val = ['C']
    elif row['A'] == 'season':
        val = ['D']
    else:
        val = ['A']
    return val

df['E'] = df.apply(f, axis=1)

EDITED: to change the last else to column 'A'

ansev · Accepted Answer · 2020-01-14 11:36:35Z

4

First, I recommend you read: When should I want to use apply() in my code.

I would use Series.replace

df['E'] = df['A'].replace(['month','quarter','season'],
                          [df['B'], df['C'], df['D']])

or numpy.select

import numpy as np

cond = [df['A'].eq('month'), df['A'].eq('quarter'), df['A'].eq('season')]
values = [df['B'], df['C'], df['D']]
df['E'] = np.select(cond, values, default=df['A'])

  A      B        C         D        E
0 month   month+1 quarter+1 season+1 month+1
1 season  month+5 quarter+3 season+2 season+2
2 day     month+1 quarter+2 season+1 day
3 year    month+3 quarter+4 season+2 year
4 quarter month+2 quarter+1 season+1 quarter+1
5 month   month+4 quarter+1 season+2 month+4
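
For reference, here is a minimal self-contained sketch of the numpy.select approach that rebuilds the sample frame from the question. The variable names conditions and choices are illustrative and not part of the original answer:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'A': ['month', 'season', 'day', 'year', 'quarter', 'month'],
    'B': ['month+1', 'month+5', 'month+1', 'month+3', 'month+2', 'month+4'],
    'C': ['quarter+1', 'quarter+3', 'quarter+2', 'quarter+4', 'quarter+1', 'quarter+1'],
    'D': ['season+1', 'season+2', 'season+1', 'season+2', 'season+1', 'season+2'],
})

# One boolean mask per case, paired by position with the column to take values from
conditions = [df['A'].eq('month'), df['A'].eq('quarter'), df['A'].eq('season')]
choices = [df['B'], df['C'], df['D']]

# Rows that match none of the masks fall back to column 'A'
df['E'] = np.select(conditions, choices, default=df['A'])
print(df)
```

The masks are checked in order, so if a row matched more than one condition, the first matching choice would be used.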

edited Jan 14, 2020 at 11:36

answered Jan 14, 2020 at 10:42

ansev


Andy L. · Accepted Answer · 2020-01-14 11:21:22Z

3

Just use np.select

c1 = df['A'] == 'month'
c2 = df['A'] == 'quarter'
c3 = df['A'] == 'season'

df['E'] = np.select([c1, c2, c3], [df['B'], df['C'], df['D']], df['A'])

Out[271]:
         A        B          C         D          E
0    month  month+1  quarter+1  season+1    month+1
1   season  month+5  quarter+3  season+2   season+2
2      day  month+1  quarter+2  season+1        day
3     year  month+3  quarter+4  season+2       year
4  quarter  month+2  quarter+1  season+1  quarter+1
5    month  month+4  quarter+1  season+2    month+4

edited Jan 14, 2020 at 11:21

ansev


answered Jan 14, 2020 at 10:42

Andy L.


4 Comments

Andy L. Over a year ago

@ansev: He mentioned that his code doesn't work. His desired output is what he wants, and my output matches it. Your solution is actually wrong because you based it on his incomplete code.

Andy L. Over a year ago

@ansev: I quote the OP's words: "I have tried playing around with a function but it did not work. See my attempt." His description: "... if not then return values in column 'A' ..."

ansev Over a year ago

True, he wrote else ['D'] but he said that by default it should be 'A', I didn't see it, +1

ansev Over a year ago

I would use df['A'] instead of df.A, because the OP could try df.T
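
To illustrate the point about bracket access: DataFrame.T is the transpose attribute, so a column that happens to be named 'T' cannot be reached with dot notation. A small sketch with a made-up frame (the data here is hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical frame with one column whose name collides with DataFrame.T
frame = pd.DataFrame({'A': ['month', 'day', 'year'], 'T': ['x', 'y', 'z']})

col_a = frame.A          # attribute access works for 'A'
transposed = frame.T     # this is the transposed frame, not the 'T' column
col_t = frame['T']       # bracket access always returns the column

print(type(transposed).__name__, transposed.shape)
print(col_t.tolist())
```

Bracket access behaves the same for every column name, which is why it is the safer habit.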

Jimmar · Accepted Answer · 2020-01-14 10:51:18Z

1

You probably need to fix your code like this:

def f(row):
    if row['A'] == 'month':
        val = row['B']
    elif row['A'] == 'quarter':
        val = row['C']
    elif row['A'] == 'season':
        val = row['D']
    else:
        val = row['A']  # default: fall back to column 'A', as the edited question asks
    return val

df['E'] = df.apply(f, axis=1)

Note: you forgot to index into row, so the function returned a one-element list instead of the cell value:

val = ['B'] # before
val = row['B'] # after

Edit: This is just to point out the problem in the code, for better approaches check out the other answers related to the usage of numpy.select
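
As a quick sanity check, here is a sketch that runs a row-wise function of this kind against the question's sample data and compares the result with numpy.select. The name pick_value is hypothetical, and the final branch falls back to column 'A' as the edited question asks:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'A': ['month', 'season', 'day', 'year', 'quarter', 'month'],
    'B': ['month+1', 'month+5', 'month+1', 'month+3', 'month+2', 'month+4'],
    'C': ['quarter+1', 'quarter+3', 'quarter+2', 'quarter+4', 'quarter+1', 'quarter+1'],
    'D': ['season+1', 'season+2', 'season+1', 'season+2', 'season+1', 'season+2'],
})

def pick_value(row):
    # Index into the row so the cell value is returned, not a list like ['B']
    if row['A'] == 'month':
        return row['B']
    elif row['A'] == 'quarter':
        return row['C']
    elif row['A'] == 'season':
        return row['D']
    return row['A']

# Row-wise version
via_apply = df.apply(pick_value, axis=1)

# Vectorized version for comparison
masks = [df['A'].eq('month'), df['A'].eq('quarter'), df['A'].eq('season')]
via_select = np.select(masks, [df['B'], df['C'], df['D']], default=df['A'])

print(via_apply.tolist() == list(via_select))
```

Both versions should produce the same column; the vectorized one avoids calling a Python function once per row.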

edited Jan 14, 2020 at 10:51

answered Jan 14, 2020 at 10:35

Jimmar


1 Comment

Jimmar Over a year ago

I just showed what was wrong in the code, wasn't trying to optimize
