1

I have an ordered list of pandas dataframes, each contains data for a given year:

yearly_data_list = [df_2000, df_2001, ... , df_2010]

These dataframes have the same shape, say R rows and C columns, and the same column names and index.

I want to create N = R*C time series of length len(yearly_data_list), one for each entry in the dataframes and put them into a nested dictionary where the final value is the list. For a given entry in the dataframes, the first value of the corresponding list would be the entry of df_2000, the second value the entry df_2001, etc.

For example, say I have N=3 dataframes with 2 rows (R=2) and 3 columns (C=3) each:

df_2000 = pd.DataFrame(data = [[12, 13, 14], [15, 16, 17]], columns=list('ABC'), index=list('xy'))

   A   B   C
x  12  13  14
y  15  16  17

df_2001 = pd.DataFrame(data = [[6, 7, 8], [9, 10, 11]], columns=list('ABC'), index=list('xy'))

   A   B   C
x  6   7   8
y  9  10  11

df_2002 = pd.DataFrame(data = [[0, 1, 2], [3, 4, 5]], columns=list('ABC'), index=list('xy'))

   A  B  C
x  0  1  2
y  3  4  5

In this case, I would like to obtain R*C=6 lists (time series) of length 3 and put the inside a nested dictionary that looks something like:

{'x': {'A': [12, 6, 0], 'B': [13, 7, 1], 'C': [14, 8, 2]},
 'y': {'A': [15, 9, 3], 'B': [16, 10, 4], 'C': [17, 11, 5]}}

Note that on the outermost level the keys come from the index and in the innermost level the keys come from the column names.

I know that with pd.Dataframe.T.to_dict() I could get 3 dictionaries that look like this but contain the individual values for each dataframe instead of the lists. In this SO answer they merge multiple dictionaries but I have not been able to make it work for my nested dictionary case.

Any help using the link's approach, or any other, would be greatly appreciated. Thanks

1 Answer 1

1

This can be done with concat and groupby:

(pd.concat([df_2000, df_2001, df_2002])
   .groupby(level=0).agg(list)
   .to_dict('index')
 )

Output:

{'x': {'A': [12, 6, 0], 'B': [13, 7, 1], 'C': [14, 8, 2]},
 'y': {'A': [15, 9, 3], 'B': [16, 10, 4], 'C': [17, 11, 5]}}
Sign up to request clarification or add additional context in comments.

1 Comment

Works great, thanks! Will accept it as soon as SO allows me to.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.