import pandas as pd

Let's say I have a dataframe like so:

df = pd.DataFrame({"a":range(4),"b":range(1,5)})

It looks like this:

   a  b
0  0  1
1  1  2
2  2  3
3  3  4

And a function that multiplies x by y:

def XtimesY(x,y):
    return x*y

If I want to add a new column to df, I can do:

df["c"] = df.apply(lambda x: XtimesY(x["a"], 2), axis=1)

It works!

Now I want to add multiple series:

I have this function:

def divideAndMultiply(x,y):
    return x/y, x*y

Something like this?

df["e"], df["f"] = df.apply(lambda x: divideAndMultiply(x["a"], 2), axis=1)

It doesn't work!

I want the 'e' column to receive the divisions and the 'f' column the multiplications.

Note: This is not the code I'm using but I'm expecting the same behavior.
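
For context, here is a minimal sketch of why the two-target assignment fails. It assumes pandas 0.23 or later, where tuple results from apply(axis=1) come back as a Series holding one tuple per row. The flag name unpack_failed is only illustrative.

```python
import pandas as pd

def divideAndMultiply(x, y):
    return x / y, x * y

df = pd.DataFrame({"a": range(4), "b": range(1, 5)})

# apply(axis=1) calls the function once per row, so there is one tuple per row.
result = df.apply(lambda row: divideAndMultiply(row["a"], 2), axis=1)
print(type(result).__name__, len(result))

# Unpacking iterates over the four rows, not over the two values in each tuple.
try:
    e, f = result
    unpack_failed = False
except ValueError as err:
    unpack_failed = True
    print("unpacking failed:", err)
```

So the left-hand side asks for two items while the right-hand side yields four.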


4 Answers


Almost there. Use zip(*...) to unpack the per-row tuples that apply returns into one sequence per new column. Try this:

def divideAndMultiply(x,y):
    return x/y, x*y

df["e"], df["f"] = zip(*df.a.apply(lambda val: divideAndMultiply(val,2)))
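
To make the zip(*...) step more concrete, here is a small sketch that splits the one-liner into stages. The intermediate names pairs, quotients and products are only illustrative and are not part of the original answer.

```python
import pandas as pd

def divideAndMultiply(x, y):
    return x / y, x * y

df = pd.DataFrame({"a": range(4), "b": range(1, 5)})

# One (quotient, product) tuple per row of column "a".
pairs = df["a"].apply(lambda val: divideAndMultiply(val, 2))

# zip(*pairs) transposes the rows of pairs into two tuples, one per output column.
quotients, products = zip(*pairs)

df["e"], df["f"] = quotients, products
print(df)
```

The star operator passes each row tuple as a separate argument to zip, which then groups all first elements together and all second elements together.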

UPDATE

Updated for pandas 0.23: use result_type='broadcast'. For further details, refer to the documentation.

Redefine your function like this:

def divideAndMultiply(x,y):
    return [x/y, x*y]

Then do this:

df[['e','f']] = df.apply(lambda x: divideAndMultiply(x["a"], 2), axis=1, result_type='broadcast')

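You should get the following result (the e column shows whole numbers, which is consistent with integer division as under Python 2; with true division the expected values are 0.0, 0.5, 1.0, 1.5):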

In [118]: df
Out[118]:
   a  b  e  f
0  0  1  0  0
1  1  2  0  2
2  2  3  1  4
3  3  4  1  6
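
On pandas 0.23 and later, apply also accepts result_type='expand', which turns list-like results into columns. Below is a minimal sketch under that version assumption. It joins the expanded frame back onto df rather than assigning through df[['e','f']], so it does not depend on how a given pandas version handles assignment to columns that do not exist yet. The variable name expanded is illustrative.

```python
import pandas as pd

def divideAndMultiply(x, y):
    return [x / y, x * y]

df = pd.DataFrame({"a": range(4), "b": range(1, 5)})

# result_type="expand" spreads each returned list across columns 0 and 1.
expanded = df.apply(
    lambda row: divideAndMultiply(row["a"], 2), axis=1, result_type="expand"
)
expanded.columns = ["e", "f"]

# Attach the new columns by index alignment.
df = df.join(expanded)
print(df)
```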

6 Comments

I've seen this answer multiple times, but any time I've tried it, I get KeyError: "['e', 'f'] not in index". I think pandas must have changed. Does it still work for you, @Abbas?
It still works. Follow the question and answer to reproduce the results.
It doesn't, not on Python 3.6 and pandas 0.23.1; I get a KeyError: repl.it/@seaders/SuperbIncompatibleAudacity
@seaders You are right, this answer doesn't work in 0.23.1, but this one does: stackoverflow.com/a/36600318/1437877
Good stuff @Abbas, I just wanted to make sure I wasn't going crazy. I can't find anywhere in the docs that this was the correct way of doing things, to then be able to see they've changed it, so it's all pretty unclear!
df["e"], df["f"] = zip(*df.apply(lambda x: divideAndMultiply(x["a"], 2), axis=1))

Should do the trick.

(I show this example so you can see how to use multiple columns as the input to create multiple new columns)
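
As a sketch of the multi-column input case mentioned above, the lambda can read more than one field from each row. Here column "b" is used as the divisor and multiplier, which is an illustrative choice and not something the question asks for.

```python
import pandas as pd

def divideAndMultiply(x, y):
    return x / y, x * y

df = pd.DataFrame({"a": range(4), "b": range(1, 5)})

# Each row is passed as a Series, so both "a" and "b" are available to the lambda.
df["e"], df["f"] = zip(
    *df.apply(lambda row: divideAndMultiply(row["a"], row["b"]), axis=1)
)
print(df)
```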


The following solution to this frustrating question works for me. I found the original suggestion in another StackOverflow post a while ago. The trick is to wrap the return values in a Series like this:

def divideAndMultiply(x,y):
    return pd.Series([x/y, x*y])

Then this works as you wanted:

df[['e','f']] = df.apply(lambda x: divideAndMultiply(x["a"], 2), axis=1)
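
A possible variation on the same idea, sketched here as an assumption and not taken from the original post: give the returned Series a labelled index, so that apply produces a frame whose columns are already named, and then concatenate it onto df. The name new_cols is illustrative.

```python
import pandas as pd

def divideAndMultiply(x, y):
    # The Series index labels become the column names of the applied result.
    return pd.Series({"e": x / y, "f": x * y})

df = pd.DataFrame({"a": range(4), "b": range(1, 5)})

# Because the function returns a Series, apply builds a DataFrame with columns e and f.
new_cols = df.apply(lambda row: divideAndMultiply(row["a"], 2), axis=1)

# Attach the new columns side by side, aligned on the row index.
df = pd.concat([df, new_cols], axis=1)
print(df)
```

Building a Series for every row adds overhead, so on large frames the zip approach may be faster.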
