4

I have a dictionary that looks like the following:

date_pair_dict = {

    "15-02-2022 15-02-2022": ["key 1 val 1", "key 1 val 2", "key 1 val 3"],
    "15-02-2022 16-02-2022": ["key 2 val 1", "key 2 val 2", "key 2 val 3"],
    "16-02-2022 16-02-2022": ["key 3 val 1", "key 3 val 2", "key 3 val 3"],
    "16-02-2022 17-02-2022": ["key 4 val 1", "key 4 val 2", "key 4 val 3"]

}

And a list of headers:

headers = ["date pair header", "header val 1", "header val 2", "header val 3"]

I would like to create a pandas.DataFrame and write this to Excel, where the format would be the following expected output:

date pair header header val 1 header val 2 header val 3
15-02-2022 15-02-2022 key 1 val 1 key 1 val 2 key 1 val 3
15-02-2022 16-02-2022 key 2 val 1 key 2 val 2 key 2 val 3
16-02-2022 16-02-2022 key 3 val 1 key 3 val 2 key 3 val 3
16-02-2022 17-02-2022 key 4 val 1 key 4 val 2 key 4 val 3

Right now, I'm using this (arguably very sad) method:

import pandas

date_pair_dict = {

    "15-02-2022 15-02-2022": ["key 1 val 1", "key 1 val 2", "key 1 val 3"],
    "15-02-2022 16-02-2022": ["key 2 val 1", "key 2 val 2", "key 2 val 3"],
    "16-02-2022 16-02-2022": ["key 3 val 1", "key 3 val 2", "key 3 val 3"],
    "16-02-2022 17-02-2022": ["key 4 val 1", "key 4 val 2", "key 4 val 3"]

}

headers = ["date pair header", "header val 1", "header val 2", "header val 3"]

list_of_keys, list_of_val_1, list_of_val_2, list_of_val_3 = [], [], [], []

for key in date_pair_dict.keys():

    list_of_keys.append(key)

    val_1, val_2, val_3 = date_pair_dict.get(key)

    list_of_val_1.append(val_1)
    list_of_val_2.append(val_2)
    list_of_val_3.append(val_3)

dataframe = pandas.DataFrame(
    {
        headers[0]: list_of_keys,
        headers[1]: list_of_val_1,
        headers[2]: list_of_val_2,
        headers[3]: list_of_val_3,
    }
)

Which is not scalable whatsoever. In reality, this date_pair_dict can have any number of keys, each corresponding to a list of any length. The length of these lists will however always remain the same, and will be known beforehand (I will always predefine the headers list).

Additionally, I believe this runs the risk of me having a dataframe that does not share the same order as the original keys, due to me doing the following:

for key in dictionary.keys():

    ....

The keys are date pairs, and need to remain in order when used as the first column of the dataframe.

Is there a better way to do this, preferably using a dictionary comprehension?

5 Answers 5

3

Like you said you can use a comprehension on your dict key/value pairs:

import pandas as pd

date_pair_dict = {

    "15-02-2022 15-02-2022": ["key 1 val 1", "key 1 val 2", "key 1 val 3"],
    "15-02-2022 16-02-2022": ["key 2 val 1", "key 2 val 2", "key 2 val 3"],
    "16-02-2022 16-02-2022": ["key 3 val 1", "key 3 val 2", "key 3 val 3"],
    "16-02-2022 17-02-2022": ["key 4 val 1", "key 4 val 2", "key 4 val 3"]

}

headers = ["date pair header", "header val 1", "header val 2", "header val 3"]

df = pd.DataFrame([[k] + v for k,v in date_pair_dict.items()], columns=headers)
print(df)

Output:

        date pair header header val 1 header val 2 header val 3
0  15-02-2022 15-02-2022  key 1 val 1  key 1 val 2  key 1 val 3
1  15-02-2022 16-02-2022  key 2 val 1  key 2 val 2  key 2 val 3
2  16-02-2022 16-02-2022  key 3 val 1  key 3 val 2  key 3 val 3
3  16-02-2022 17-02-2022  key 4 val 1  key 4 val 2  key 4 val 3
Sign up to request clarification or add additional context in comments.

3 Comments

I don't use dictionaries often, will using .items() in this way maintain the order of the keys and their values (as defined in my date_pair_dict) when converting it to a dataframe?
Depends on the version of Python that is used. You should always reorder the resulting dataframe explicitly.
Understood, thank you for the helping hand. Maintaining the order of the keys is a secondary problem in any case, and should likely be dealt with as such. I wasn't at all aware that different versions of Python treated dicts differently.
1

IIUC, you can simply do:

df = pd.DataFrame.from_dict(date_pair_dict, orient='index').reset_index()
df.columns = headers

It's always a good idea to exploit existing methods before trying to come up with your own constructor.

1 Comment

I really like this method due to it being concise and easy to follow. I've accepted the answer by @Tranbi due to my question asking for a comprehension, but I will definitely be comparing both this and the accepted answer further. Thank you very much for the help.
0

Use list comprehension to create a list of rows from the input dictionary, after which it is straightforward:

import pandas as pd

date_pair_dict = {

    "15-02-2022 15-02-2022": ["key 1 val 1", "key 1 val 2", "key 1 val 3"],
    "15-02-2022 16-02-2022": ["key 2 val 1", "key 2 val 2", "key 2 val 3"],
    "16-02-2022 16-02-2022": ["key 3 val 1", "key 3 val 2", "key 3 val 3"],
    "16-02-2022 17-02-2022": ["key 4 val 1", "key 4 val 2", "key 4 val 3"]

}
headers = ["date pair header", "header val 1", "header val 2", "header val 3"]

date_pairs = [[k] + v for k, v in date_pair_dict.items()]
 df = pd.DataFrame(date_pairs)
df.columns = headers
print(df)
#         date pair header header val 1 header val 2 header val 3
# 0  15-02-2022 15-02-2022  key 1 val 1  key 1 val 2  key 1 val 3
# 1  15-02-2022 16-02-2022  key 2 val 1  key 2 val 2  key 2 val 3
# 2  16-02-2022 16-02-2022  key 3 val 1  key 3 val 2  key 3 val 3
# 3  16-02-2022 17-02-2022  key 4 val 1  key 4 val 2  key 4 val 3

Comments

0

Some fancy rename approach:

>>> pandas.DataFrame([[key,*values] for key,values in date_pair_dict.items()]).rename(columns=headers.__getitem__))

        date pair header header val 1 header val 2 header val 3
0  15-02-2022 15-02-2022  key 1 val 1  key 1 val 2  key 1 val 3
1  15-02-2022 16-02-2022  key 2 val 1  key 2 val 2  key 2 val 3
2  16-02-2022 16-02-2022  key 3 val 1  key 3 val 2  key 3 val 3
3  16-02-2022 17-02-2022  key 4 val 1  key 4 val 2  key 4 val 3

Comments

-1

You can use list traversal to construct the output dictionary:

import pandas

date_pair_dict = {

    "15-02-2022 15-02-2022": ["key 1 val 1", "key 1 val 2", "key 1 val 3"],
    "15-02-2022 16-02-2022": ["key 2 val 1", "key 2 val 2", "key 2 val 3"],
    "16-02-2022 16-02-2022": ["key 3 val 1", "key 3 val 2", "key 3 val 3"],
    "16-02-2022 17-02-2022": ["key 4 val 1", "key 4 val 2", "key 4 val 3"]

}

headers = ["date pair header", "header val 1", "header val 2", "header val 3"]

e = [[k] + v for k,v in date_pair_dict.items()]
d = {}
for i in range(len(e[0])):
    d[headers[i]] = [a[i] for a in e]
    
d = pandas.DataFrame(d)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.