1

I have the following data frame:

import pandas as pd

units = [1, 1, 1, 5, 5, 5]
locations = [30, 30, 30, 32, 32, 32]
timestamps = [1, 2, 3, 1, 2, 3]
quantities = [1, 5, 3, 10, 35, 39]
data = {'units': units, 'locations': locations, 'timestamps': timestamps,
        'quantities': quantities}
df = pd.DataFrame(data=data)

that looks like this:

>>> df
   units  locations  timestamps  quantities
0      1         30           1           1
1      1         30           2           5
2      1         30           3           3
3      5         32           1          10
4      5         32           2          35
5      5         32           3          39

I need to get a list of data frames, one for each unique combination of units and locations, i.e. something that uses df.groupby(['units', 'locations']). The end result should look something like this:

(1, 30)
   timestamps  quantities
0           1           1
1           2           5
2           3           3

(5, 32)
   timestamps  quantities
3           1          10
4           2          35
5           3          39

Is this possible, please?

3 Answers

2

Run a dictionary comprehension over the groupby. You can read more about this on the "Group by: split-apply-combine" page of the pandas documentation:

d = {name:group.filter(['timestamps','quantities']) 
     for name, group in df.groupby(['units','locations'])}

# print(d.keys())
# dict_keys([(1, 30), (5, 32)])

print(d[(1, 30)])

   timestamps  quantities
0           1           1
1           2           5
2           3           3

print(d[(5, 32)])

   timestamps  quantities
3           1          10
4           2          35
5           3          39
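
Since the question asks for a list rather than a dictionary, the same idea can be sketched with a list comprehension. This is only an illustration: the names key_cols, pairs and frames are made up here, and it uses drop(columns=...) instead of filter to remove the grouping columns.

```python
import pandas as pd

df = pd.DataFrame({
    'units': [1, 1, 1, 5, 5, 5],
    'locations': [30, 30, 30, 32, 32, 32],
    'timestamps': [1, 2, 3, 1, 2, 3],
    'quantities': [1, 5, 3, 10, 35, 39],
})

# Hypothetical name for the grouping columns.
key_cols = ['units', 'locations']

# One (key, frame) pair per unique combination; the grouping columns
# are dropped from each frame, and the original row index is kept.
pairs = [(key, group.drop(columns=key_cols))
         for key, group in df.groupby(key_cols)]

# Plain list of data frames, in the sorted key order that groupby uses by default.
frames = [frame for _, frame in pairs]

for key, frame in pairs:
    print(key)
    print(frame)
```

Keeping the keys next to the frames in pairs makes it possible to tell which combination each frame belongs to, which a bare list would lose.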

1

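Another method is to use a dict comprehension with groupby and pass the result to pd.concat, which returns a single data frame indexed by (units, locations) instead of separate data frames: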

d = pd.concat({combo: data for combo, data in df.groupby(['units', 'locations'])})

print(d)

        units  locations  timestamps  quantities
1 30 0      1         30           1           1
     1      1         30           2           5
     2      1         30           3           3
5 32 3      5         32           1          10
     4      5         32           2          35
     5      5         32           3          39
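
To get back the per-group view the question asks for, one option is to select a combination from the concatenated frame with .loc. The sketch below assumes the same df as in the question; the names combined and subset are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'units': [1, 1, 1, 5, 5, 5],
    'locations': [30, 30, 30, 32, 32, 32],
    'timestamps': [1, 2, 3, 1, 2, 3],
    'quantities': [1, 5, 3, 10, 35, 39],
})

# The tuple keys become the outer levels of a MultiIndex on the rows.
combined = pd.concat({combo: data
                      for combo, data in df.groupby(['units', 'locations'])})

# Select the rows for units == 1 and locations == 30, keeping only
# the two value columns.
subset = combined.loc[(1, 30), ['timestamps', 'quantities']]
print(subset)
```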

0

You are right that it is just groupby. Drop the key columns first and group by the original columns passed as Series:

cols = ['units','locations']
for k, d in df.drop(cols, axis=1).groupby([df[c] for c in cols]):
    print(k)
    print(d)

Output:

(1, 30)
   timestamps  quantities
0           1           1
1           2           5
2           3           3
(5, 32)
   timestamps  quantities
3           1          10
4           2          35
5           3          39
