I have a DataFrame with columns Age, Salary and others. If I use:
df['Age'] = df['Age'].apply(lambda x : x+100 if x>30 else 0)
then I can modify the Age column with the if/else condition. Also, if I use:
df[['Age', 'Salary']] = df[['Age', 'Salary']].apply(lambda x : x+100)
then I can apply the lambda function to each column independently. But as soon as I use an if/else condition on both columns, as in:
df[['Age', 'Salary']] = df[['Age', 'Salary']].apply(lambda x : x+100 if x>30 else 0)
Then I get the following error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
So, how can I modify the Age, Salary and any number of other columns, applying the same if/else (or another lambda condition) to each column independently?
I know two possible solutions:
- Use a for loop over the columns:
cols = ['Age', 'Salary']
for i in cols:
    df[i] = df[i].apply(lambda x : x+100 if x>30 else 0)
- Call apply on each column separately:
df['Age'] = df['Age'].apply(lambda x : x+100 if x>30 else 0)
df['Salary'] = df['Salary'].apply(lambda x : x+100 if x>30 else 0)
Is there a way to do the same in a single line (like the code I tried), using apply or another function?
Use a vectorized comparison such as df[['Age', 'Salary']] > 30 and then fill the true/false values (e.g. with mask()). You can use pipe() to avoid retyping the column slice:
df[['Age', 'Salary']].pipe(lambda x: x.mask(x > 30, x + 100))
pipe() is not necessary. Masks that have a subset of the column index apply to only that subset in the target frame.
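Note that mask(x > 30, x + 100) only changes the values above 30 and leaves the rest as they are, and that the result still has to be assigned back to the frame. To reproduce the question's exact if/else (x + 100 if x > 30, otherwise 0), one possible sketch uses where() instead. The sample data and the Name column are made up for illustration; only Age and Salary come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the Age and Salary names come from the question.
df = pd.DataFrame({
    "Name": ["a", "b", "c"],
    "Age": [25, 31, 40],
    "Salary": [20, 30, 5000],
})

cols = ["Age", "Salary"]
sub = df[cols]

# where() keeps a value where the condition is True and substitutes
# the second argument elsewhere: x + 100 if x > 30, otherwise 0.
df[cols] = (sub + 100).where(sub > 30, 0)

print(df)
```

Columns outside cols (here Name) are untouched, and extending the rule to more columns only means adding their names to cols.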