Say I have the following dataframe and want to group by the ys column:

   xs  ys
0   0   0
1   1   0
2   2   1
3   3   1

I can do this by running

grouped = df.groupby('ys')

I can iterate over this groupby object without trouble, but what I actually want is a list of the dataframes that the following loop yields as group:

for name, group in grouped:
    do_something(group)

Is this possible?

  • You want a list that is accessed by a group name? That's impossible. You probably want a dict. Commented Jan 16, 2017 at 17:26
  • No, I just want a list of the dataframes and don't care about the names. Just realised my loop was wrong, should be corrected now Commented Jan 16, 2017 at 17:28

1 Answer

Sure, just iterate over the groups!

>>> import pandas as pd
>>> df = pd.DataFrame(dict(xs=list(range(4)), ys=[0,0,1,1]))
>>> df
   xs  ys
0   0   0
1   1   0
2   2   1
3   3   1
>>> grouped = df.groupby('ys')
>>> dataframes = [group for _, group in grouped]
>>> dataframes
[   xs  ys
0   0   0
1   1   0,    xs  ys
2   2   1
3   3   1]
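
If the group names turn out to matter after all, as the first comment on the question suggests, the same iteration can build a dict instead of a list. This is only a minimal sketch, assuming the same df as above; the names frames_by_key and frames are illustrative and not part of the original answer.

```python
import pandas as pd

# Same example data as in the answer above.
df = pd.DataFrame({"xs": list(range(4)), "ys": [0, 0, 1, 1]})
grouped = df.groupby("ys")

# Map each group key (a distinct value of ys) to its sub-dataframe.
# frames_by_key is a hypothetical name chosen for this sketch.
frames_by_key = {name: group for name, group in grouped}

# If only the dataframes are needed, the dict values give the same list
# as the comprehension in the answer, in the same iteration order.
frames = list(frames_by_key.values())
```

The dict keeps the key available for lookups such as frames_by_key[0], while the list form is enough when the names really don't matter.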

3 Comments

Of course... almost answered myself in the question! Thanks for the answer!
Can someone explain how [group for _, group in grouped] works?
@bcikili, it is equivalent to [group for name, group in grouped]. The underscore is used to emphasize that we don't care about the name.
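
To spell out the comprehension discussed in these comments: iterating over a groupby object yields (name, group) pairs, and the comprehension keeps only the second element of each pair. A rough equivalent written as an explicit loop, assuming the same df as in the answer, might look like this:

```python
import pandas as pd

# Same example data as in the answer.
df = pd.DataFrame({"xs": list(range(4)), "ys": [0, 0, 1, 1]})
grouped = df.groupby("ys")

dataframes = []
# Each iteration unpacks one (name, group) pair; the name is bound to _
# purely by convention, to signal that it is intentionally unused.
for _, group in grouped:
    dataframes.append(group)
```

The underscore is an ordinary variable name, so nothing special happens at runtime; it only tells the reader that the value is being ignored.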
