Consider the following code:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randint(0, 4, 16).reshape(4, 4), columns=list('ABCD'))
>>> df
   A  B  C  D
0  2  1  0  2
1  3  0  2  2
2  0  2  0  2
3  2  1  2  0
>>> def grouper(frame):
...     return frame
...
>>> df.groupby('A').apply(grouper)
   A  B  C  D
0  2  1  0  2
1  3  0  2  2
2  0  2  0  2
3  2  1  2  0
As you can see, the result of apply is identical to the original dataframe.
Here is the documentation of apply:
The function passed to apply must take a dataframe as its first argument and return a DataFrame, Series or scalar. apply will then take care of combining the results back together into a single dataframe or series. apply is therefore a highly flexible grouping method.
groupby will split the dataframe into smaller dataframes, one per distinct value of A, like this:
   A  B  C  D
2  0  2  0  2

   A  B  C  D
0  2  1  0  2
3  2  1  2  0

   A  B  C  D
1  3  0  2  2
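To make the split reproducible, here is a minimal sketch that hard-codes the values printed above (instead of using random data) and iterates over the groupby object. The dictionary name `groups` is only an illustrative helper, not something from pandas.

```python
import pandas as pd

# Fixed copy of the example frame so the output does not depend on random data.
df = pd.DataFrame(
    [[2, 1, 0, 2], [3, 0, 2, 2], [0, 2, 0, 2], [2, 1, 2, 0]],
    columns=list("ABCD"),
)

# Iterating over a GroupBy yields (key, sub-frame) pairs,
# ordered by group key by default (sort=True).
groups = {key: sub for key, sub in df.groupby("A")}

for key, sub in groups.items():
    print(f"A == {key}")
    print(sub)
```

Each sub-frame keeps the row labels it had in the original dataframe, which is why the index values 2, 0, 3 and 1 show up in that order.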
The apply documentation says that it combines the results back into a single dataframe. What I don't understand is how they were combined so that the final result has the same row order as the original dataframe. If it had simply used concat on the groups, I would have expected the final dataframe to be:
   A  B  C  D
2  0  2  0  2
0  2  1  0  2
3  2  1  2  0
1  3  0  2  2
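The expectation above can be checked by hand with the sketch below, which again hard-codes the example values. The final `reindex` step is only my guess at one way the original order could be recovered; I have not confirmed that apply does this internally, and the output of apply for a function like `grouper` may differ between pandas versions.

```python
import pandas as pd

# Fixed copy of the example frame so the output does not depend on random data.
df = pd.DataFrame(
    [[2, 1, 0, 2], [3, 0, 2, 2], [0, 2, 0, 2], [2, 1, 2, 0]],
    columns=list("ABCD"),
)

# Concatenate the per-group sub-frames in group-key order.
pieces = [sub for _, sub in df.groupby("A")]
stacked = pd.concat(pieces)
print(stacked)  # row labels follow the groups, not the original order

# Assumption: realigning to the original index is one way to undo the grouping order.
restored = stacked.reindex(df.index)
print(restored.equals(df))
```

So a plain concat reproduces the order I expected, and some extra realignment step is needed to get back to the original row order.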
How does apply actually perform this combination?