6

If I have a function f that I am applying more than once to a set of columns, what's a more Pythonic way of going about it? Right now, this is what I am doing:

newdf=df.groupby(['a', 'b']).apply(lambda x: f(x, 1))
newdf.columns=['1']
newdf['2']=df.groupby(['a', 'b']).apply(lambda x: f(x, 2))
newdf['3']=df.groupby(['a', 'b']).apply(lambda x: f(x, 3))
newdf['4']=df.groupby(['a', 'b']).apply(lambda x: f(x, 4))

Is there a better way of going about it?

Thanks,

4
  • 1
    Please provide a sample dataframe and expected output. Commented Jun 14, 2018 at 13:40
  • I deleted my answer too since I don't think it was pythonic enough, and pandas groupby can be tricky. I'll leave it here and say you can try newdf = pd.concat( df.groupby(['a', 'b']).apply(lambda x: f(x, i)) for i in range(1, 5), axis=1 ). And a sample dataframe would help. Commented Jun 14, 2018 at 13:42
  • 1
    @Linda Can you tell us what the function is doing? Commented Jun 14, 2018 at 13:45
  • I agree with others, if you can share a mcve it will be easy to help you. Depending on the function you can even get rid of the apply Commented Jun 14, 2018 at 13:53

4 Answers

2

That's pythonic enough for me:

columns_dict = dict()
for i in range(1, 5):
    columns_dict[str(i)] = df.groupby(["a", "b"]).apply(lambda x: f(x, i))

pd.DataFrame(columns_dict)

1

You could do :

pandas.DataFrame([df.groupby(['a','b']).apply(lambda x : f(x,i)) for i in range(1,5)])

Then transpose the new DataFrame if you want to have the same column names as the initial dataframe.

2 Comments

This is not significantly different from Jay Calamari's answer.
Jay Calamari's solution won't even work, as the `pandas.concat` function needs its arguments to be passed as a list. He is missing some brackets over there!
1

Use agg() to compute multiple values from a single groupby():

df.groupby(['a', 'b']).agg([
    ('1', lambda x: f(x, 1)),
    ('2', lambda x: f(x, 2)),
    ('3', lambda x: f(x, 3)),
    ('4', lambda x: f(x, 4)),
])

Or equivalently:

# i=i binds the current value of i; a bare lambda would only see the final i
df.groupby(['a', 'b']).agg([(str(i), lambda x, i=i: f(x, i)) for i in range(1, 5)])
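
A note on building lambdas inside a comprehension: Python closures look up i when the lambda is called, not when it is defined, so lambdas that are stored and called later all see the last value of i. Here is a minimal pure-Python sketch of the pitfall and the usual default-argument workaround. The function f below is only a stand-in (x**i, borrowed from the comments), not the asker's actual function.

```python
def f(x, i):
    # Hypothetical stand-in for the real function
    return x ** i

# Late binding: every lambda reads i at call time, after the loop has finished
late = [lambda x: f(x, i) for i in range(1, 5)]

# Default argument: i=i captures the value of i at definition time
bound = [lambda x, i=i: f(x, i) for i in range(1, 5)]

late_results = [fn(2) for fn in late]
bound_results = [fn(2) for fn in bound]

print(late_results)   # [16, 16, 16, 16]
print(bound_results)  # [2, 4, 8, 16]
```

This does not affect the loop-based answers, because there the lambda is called immediately inside apply() while i still has the intended value.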

4 Comments

Does this work for df = pd.DataFrame({'a': [5], 'b':[3]}) and def f(x,i): return x**i ?
@astro123: I don't know, does it? Do I look like a Python interpreter?
Obviously not. :) I was trying out that pandas example, it did not work, and I was wondering why.
Unfortunately, .agg does not give access to the index through x.name.
0

Pandas groupby.apply accepts arbitrary positional and keyword arguments, which are passed on to the applied function. In addition, you can create a dictionary mapping each column name to its argument. Finally, you can reuse a single groupby object, defined outside your loop.

argmap = {'2': 2, '3': 3, '4': 4}

grouper = df.groupby(['a', 'b'])

for k, v in argmap.items():
    newdf[k] = grouper.apply(f, v)
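
Since no sample data was posted, here is a minimal end-to-end sketch of the same idea on made-up data. The column c, its values, and the body of f are all assumptions for illustration; only the pattern (one groupby object, extra positional arguments forwarded by apply) is the point.

```python
import pandas as pd

# Made-up sample data; column "c" and all values are assumptions
df = pd.DataFrame({
    "a": [1, 1, 2, 2],
    "b": ["x", "x", "y", "y"],
    "c": [1, 2, 3, 4],
})

def f(group, n):
    """Hypothetical stand-in for f: sum of column c raised to the n-th power."""
    return (group["c"] ** n).sum()

# Build the groupby object once and reuse it for every argument
grouper = df.groupby(["a", "b"])

# Extra positional arguments given to apply() are forwarded to f
newdf = pd.DataFrame({str(n): grouper.apply(f, n) for n in range(1, 5)})
print(newdf)
```

Building all the columns in one dict also avoids the special case for the first column in the question's code.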

1 Comment

I think he wants multiple arguments to be passed to the apply function, rather than calling the apply function multiple times. This is important when the two columns are required in a single invocation.
