Converting a dictionary of lists to a pandas.DataFrame using predefined headers

Question

I have a dictionary that looks like the following:

date_pair_dict = {

    "15-02-2022 15-02-2022": ["key 1 val 1", "key 1 val 2", "key 1 val 3"],
    "15-02-2022 16-02-2022": ["key 2 val 1", "key 2 val 2", "key 2 val 3"],
    "16-02-2022 16-02-2022": ["key 3 val 1", "key 3 val 2", "key 3 val 3"],
    "16-02-2022 17-02-2022": ["key 4 val 1", "key 4 val 2", "key 4 val 3"]

}

And a list of headers:

headers = ["date pair header", "header val 1", "header val 2", "header val 3"]

I would like to create a pandas.DataFrame and write this to Excel, where the format would be the following expected output:

date pair header	header val 1	header val 2	header val 3
15-02-2022 15-02-2022	key 1 val 1	key 1 val 2	key 1 val 3
15-02-2022 16-02-2022	key 2 val 1	key 2 val 2	key 2 val 3
16-02-2022 16-02-2022	key 3 val 1	key 3 val 2	key 3 val 3
16-02-2022 17-02-2022	key 4 val 1	key 4 val 2	key 4 val 3

Right now, I'm using this (arguably very sad) method:

import pandas

date_pair_dict = {

    "15-02-2022 15-02-2022": ["key 1 val 1", "key 1 val 2", "key 1 val 3"],
    "15-02-2022 16-02-2022": ["key 2 val 1", "key 2 val 2", "key 2 val 3"],
    "16-02-2022 16-02-2022": ["key 3 val 1", "key 3 val 2", "key 3 val 3"],
    "16-02-2022 17-02-2022": ["key 4 val 1", "key 4 val 2", "key 4 val 3"]

}

headers = ["date pair header", "header val 1", "header val 2", "header val 3"]

list_of_keys, list_of_val_1, list_of_val_2, list_of_val_3 = [], [], [], []

for key in date_pair_dict.keys():

    list_of_keys.append(key)

    val_1, val_2, val_3 = date_pair_dict.get(key)

    list_of_val_1.append(val_1)
    list_of_val_2.append(val_2)
    list_of_val_3.append(val_3)

dataframe = pandas.DataFrame(
    {
        headers[0]: list_of_keys,
        headers[1]: list_of_val_1,
        headers[2]: list_of_val_2,
        headers[3]: list_of_val_3,
    }
)

Which is not scalable whatsoever. In reality, this date_pair_dict can have any number of keys, each corresponding to a list of any length. The length of these lists will however always remain the same, and will be known beforehand (I will always predefine the headers list).

Additionally, I believe this runs the risk of me having a dataframe that does not share the same order as the original keys, due to me doing the following:

for key in dictionary.keys():

    ....

The keys are date pairs, and need to remain in order when used as the first column of the dataframe.

Is there a better way to do this, preferably using a dictionary comprehension?

Tranbi · Accepted Answer · 2022-06-29 14:33:24Z

3

Like you said you can use a comprehension on your dict key/value pairs:

import pandas as pd

date_pair_dict = {

    "15-02-2022 15-02-2022": ["key 1 val 1", "key 1 val 2", "key 1 val 3"],
    "15-02-2022 16-02-2022": ["key 2 val 1", "key 2 val 2", "key 2 val 3"],
    "16-02-2022 16-02-2022": ["key 3 val 1", "key 3 val 2", "key 3 val 3"],
    "16-02-2022 17-02-2022": ["key 4 val 1", "key 4 val 2", "key 4 val 3"]

}

headers = ["date pair header", "header val 1", "header val 2", "header val 3"]

df = pd.DataFrame([[k] + v for k,v in date_pair_dict.items()], columns=headers)
print(df)

Output:

        date pair header header val 1 header val 2 header val 3
0  15-02-2022 15-02-2022  key 1 val 1  key 1 val 2  key 1 val 3
1  15-02-2022 16-02-2022  key 2 val 1  key 2 val 2  key 2 val 3
2  16-02-2022 16-02-2022  key 3 val 1  key 3 val 2  key 3 val 3
3  16-02-2022 17-02-2022  key 4 val 1  key 4 val 2  key 4 val 3

answered Jun 29, 2022 at 14:33

Tranbi

12.8k6 gold badges19 silver badges39 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

ChaddRobertson Over a year ago

I don't use dictionaries often, will using .items() in this way maintain the order of the keys and their values (as defined in my date_pair_dict) when converting it to a dataframe?

Sebastian Wozny Over a year ago

Depends on the version of Python that is used. You should always reorder the resulting dataframe explicitly.

ChaddRobertson Over a year ago

Understood, thank you for the helping hand. Maintaining the order of the keys is a secondary problem in any case, and should likely be dealt with as such. I wasn't at all aware that different versions of Python treated dicts differently.

fsimonjetz · Accepted Answer · 2022-06-29 14:44:42Z

1

IIUC, you can simply do:

df = pd.DataFrame.from_dict(date_pair_dict, orient='index').reset_index()
df.columns = headers

It's always a good idea to exploit existing methods before trying to come up with your own constructor.

edited Jun 29, 2022 at 14:44

answered Jun 29, 2022 at 14:34

fsimonjetz

5,7923 gold badges7 silver badges23 bronze badges

1 Comment

ChaddRobertson Over a year ago

I really like this method due to it being concise and easy to follow. I've accepted the answer by @Tranbi due to my question asking for a comprehension, but I will definitely be comparing both this and the accepted answer further. Thank you very much for the help.

Timur Shtatland · Accepted Answer · 2022-06-29 14:42:22Z

Use list comprehension to create a list of rows from the input dictionary, after which it is straightforward:

import pandas as pd

date_pair_dict = {

    "15-02-2022 15-02-2022": ["key 1 val 1", "key 1 val 2", "key 1 val 3"],
    "15-02-2022 16-02-2022": ["key 2 val 1", "key 2 val 2", "key 2 val 3"],
    "16-02-2022 16-02-2022": ["key 3 val 1", "key 3 val 2", "key 3 val 3"],
    "16-02-2022 17-02-2022": ["key 4 val 1", "key 4 val 2", "key 4 val 3"]

}
headers = ["date pair header", "header val 1", "header val 2", "header val 3"]

date_pairs = [[k] + v for k, v in date_pair_dict.items()]
 df = pd.DataFrame(date_pairs)
df.columns = headers
print(df)
#         date pair header header val 1 header val 2 header val 3
# 0  15-02-2022 15-02-2022  key 1 val 1  key 1 val 2  key 1 val 3
# 1  15-02-2022 16-02-2022  key 2 val 1  key 2 val 2  key 2 val 3
# 2  16-02-2022 16-02-2022  key 3 val 1  key 3 val 2  key 3 val 3
# 3  16-02-2022 17-02-2022  key 4 val 1  key 4 val 2  key 4 val 3

Sebastian Wozny · Accepted Answer · 2022-06-29 14:42:23Z

0

Some fancy rename approach:

>>> pandas.DataFrame([[key,*values] for key,values in date_pair_dict.items()]).rename(columns=headers.__getitem__))

        date pair header header val 1 header val 2 header val 3
0  15-02-2022 15-02-2022  key 1 val 1  key 1 val 2  key 1 val 3
1  15-02-2022 16-02-2022  key 2 val 1  key 2 val 2  key 2 val 3
2  16-02-2022 16-02-2022  key 3 val 1  key 3 val 2  key 3 val 3
3  16-02-2022 17-02-2022  key 4 val 1  key 4 val 2  key 4 val 3

answered Jun 29, 2022 at 14:42

Sebastian Wozny

17.6k7 gold badges55 silver badges73 bronze badges

Comments

Cardstdani · Accepted Answer · 2022-06-29 14:36:44Z

-1

You can use list traversal to construct the output dictionary:

import pandas

date_pair_dict = {

    "15-02-2022 15-02-2022": ["key 1 val 1", "key 1 val 2", "key 1 val 3"],
    "15-02-2022 16-02-2022": ["key 2 val 1", "key 2 val 2", "key 2 val 3"],
    "16-02-2022 16-02-2022": ["key 3 val 1", "key 3 val 2", "key 3 val 3"],
    "16-02-2022 17-02-2022": ["key 4 val 1", "key 4 val 2", "key 4 val 3"]

}

headers = ["date pair header", "header val 1", "header val 2", "header val 3"]

e = [[k] + v for k,v in date_pair_dict.items()]
d = {}
for i in range(len(e[0])):
    d[headers[i]] = [a[i] for a in e]
    
d = pandas.DataFrame(d)

answered Jun 29, 2022 at 14:36

Cardstdani

5,2513 gold badges14 silver badges37 bronze badges

Collectives™ on Stack Overflow

Converting a dictionary of lists to a pandas.DataFrame using predefined headers

5 Answers 5

3 Comments

1 Comment

Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

3 Comments

1 Comment

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related