I am trying to obtain lists from a DataFrame, grouped by a common value of the index.

In the example below I am trying to obtain the lists for 'type' and 'xx' based on 'date'.

Here is the DataFrame:

import pandas as pd
import numpy as np

idx = [np.array(['Jan', 'Jan', 'Feb', 'Mar', 'Mar', 'Mar']),np.array(['A1', 'A2', 'A2', 'A1', 'A3', 'A4'])]
data = [{'xx': 1}, {'xx': 5}, {'xx': 3}, {'xx': 2}, {'xx': 7}, {'xx': 3}]
df = pd.DataFrame(data, index=idx, columns=['xx'])
df.index.names = ['date', 'type']
# Move both index levels into columns, then keep only 'date' as the index
df.reset_index(inplace=True)
df = df.set_index(['date'])

Which looks like this:

      type  xx
date        
 Jan    A1   1
 Jan    A2   5
 Feb    A2   3
 Mar    A1   2
 Mar    A3   7
 Mar    A4   3

What I am trying to do is to create these two lists:

#list_type
[['A1', 'A2'], ['A2'], ['A1', 'A3', 'A4']]

#list_xx
[['1', '5'], ['3'], ['2', '7', '3']]

As you can see, the elements of the lists are constructed based on a common date.

I would really value an efficient way of doing this in Python.

1 Answer

Use GroupBy.agg with list, then convert the resulting DataFrame to a dictionary of lists with DataFrame.to_dict:

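# sort=False keeps the groups in order of first appearance (Jan, Feb, Mar).
# Spell out orient='list': the short form 'l' was removed in pandas 2.0.
d = df.groupby(level=0, sort=False).agg(list).to_dict('list')
print(d)
{'type': [['A1', 'A2'], ['A2'], ['A1', 'A3', 'A4']], 'xx': [[1, 5], [3], [2, 7, 3]]}

print(d['type'])
[['A1', 'A2'], ['A2'], ['A1', 'A3', 'A4']]

print(d['xx'])
[[1, 5], [3], [2, 7, 3]]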
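
Note that the output above holds integers in 'xx', while the question shows list_xx with string elements. If strings are really what is needed, one possible alternative is to iterate over the groups and build each list directly. The following is only a sketch: it rebuilds the question's DataFrame in a more compact way (an assumption made to keep the example self-contained), and the names grouped, list_type and list_xx are illustrative.

```python
import pandas as pd

# Compact reconstruction of the DataFrame from the question:
# 'date' is the index, 'type' and 'xx' are ordinary columns.
df = pd.DataFrame(
    {
        "date": ["Jan", "Jan", "Feb", "Mar", "Mar", "Mar"],
        "type": ["A1", "A2", "A2", "A1", "A3", "A4"],
        "xx": [1, 5, 3, 2, 7, 3],
    }
).set_index("date")

# Group on the index; sort=False keeps the order of first appearance.
grouped = df.groupby(level=0, sort=False)

# One inner list per date, taken from each group's sub-DataFrame.
list_type = [group["type"].tolist() for _, group in grouped]

# Cast to str first so the inner lists hold strings, as in the question.
list_xx = [group["xx"].astype(str).tolist() for _, group in grouped]

print(list_type)
print(list_xx)
```

Dropping the astype(str) call gives the same integer lists as the agg(list) approach. For large frames the agg(list) version is likely the better choice, since it avoids the Python-level loop over groups.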