Consider the following code:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randint(0, 4, 16).reshape(4, 4), columns=list('ABCD'))
>>> df
   A  B  C  D
0  2  1  0  2
1  3  0  2  2
2  0  2  0  2
3  2  1  2  0
>>> def grouper(frame):
...     return frame
...
>>> df.groupby('A').apply(grouper)
   A  B  C  D
0  2  1  0  2
1  3  0  2  2
2  0  2  0  2
3  2  1  2  0

As you can see, the result is identical to the original DataFrame, row order included. Here is what the documentation says about apply:

The function passed to apply must take a dataframe as its first argument and return a DataFrame, Series or scalar. apply will then take care of combining the results back together into a single dataframe or series. apply is therefore a highly flexible grouping method.

groupby splits the DataFrame into one small DataFrame per group, like this:

   A  B  C  D
2  0  2  0  2

   A  B  C  D
0  2  1  0  2
3  2  1  2  0

   A  B  C  D
1  3  0  2  2

The apply documentation says that it combines the DataFrames back into a single DataFrame. I am curious how it combined them so that the final result is the same as the original DataFrame. If it had simply used concat, the final DataFrame would have been:

   A  B  C  D
2  0  2  0  2
0  2  1  0  2
3  2  1  2  0
1  3  0  2  2

How is this combination actually performed?

1 Answer

If you look at the source code, you will see a flag named not_indexed_same that indicates whether the index of the applied result differs from the original index. When the index is unchanged, as it is when the function returns each group as-is, groupby reindexes the combined result back to the original row order before returning it. I do not know why this was implemented.

The change was made on Aug 21, 2011 and Wes made no comments on the change: https://github.com/pandas-dev/pandas/commit/00c8da0208553c37ca6df0197da431515df813b7#diff-720d374f1a709d0075a1f0a02445cd65
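
To illustrate the difference between plain concatenation and the reindexing described above, here is a minimal sketch that does both steps by hand. It hard-codes the values from the question's DataFrame instead of drawing random numbers, and the names pieces, concatenated and restored are only illustrative. It is not the pandas internals, just an approximation of the effect.

```python
import pandas as pd

# Same values as the DataFrame in the question, fixed so the example is reproducible
df = pd.DataFrame(
    {"A": [2, 3, 0, 2], "B": [1, 0, 2, 1], "C": [0, 2, 0, 2], "D": [2, 2, 2, 0]}
)

# Split into one DataFrame per group; groups come out in sorted key order (0, 2, 3)
pieces = [group for _, group in df.groupby("A")]

# Plain concatenation keeps the group order, so the rows are shuffled
concatenated = pd.concat(pieces)
print(concatenated.index.tolist())  # [2, 0, 3, 1]

# Reindexing against the original index puts the rows back in their original order
restored = concatenated.reindex(df.index)
print(restored.equals(df))  # True
```

The first print shows the order that concat alone would produce, which matches the table in the question. The reindex step is what brings the rows back to the original order.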
