0

I'm trying to save a pandas DataFrame as an excel file and import it again and convert it back to a dictionary. The data frame is quite large in size. For instance, consider the following code:

import pandas as pd

path = 'file.xlsx'
dict1 = {'a' : [3, [1, 2, 3], 'text1'],
         'b' : [4, [4, 5, 6, 7], 'text2']}
print('\n\nType 1:', type(dict1['a'][1]))

df1 = pd.DataFrame(dict1)
df1.to_excel(path, sheet_name='Sheet1')
print("\n\nSaved df:\n", df1 , '\n\n')

df2 = pd.read_excel(path, sheet_name='Sheet1')
print("\n\nLoaded df:\n", df2 , '\n\n')

dict2 = df2.to_dict(orient='list')
print("New dict:", dict2, '\n\n')
print('Type 2:', type(dict2['a'][1]))

The output is:

Type 1: <class 'list'>


Saved df:
            a             b
0          3             4
1  [1, 2, 3]  [4, 5, 6, 7]
2      text1         text2




Loaded df:
            a             b
0          3             4
1  [1, 2, 3]  [4, 5, 6, 7]
2      text1         text2


New dict: {'a': [3, '[1, 2, 3]', 'text1'], 'b': [4, '[4, 5, 6, 7]', 'text2']}


Type 2: <class 'str'>

Could you help me get back the original dictionary with the same element types? Thank you!

1 Answer 1

1

Now, there is an option with read_excel which allows us to change the dtype of the columns as they're read in, however there is no such option to change the dtype of any of the rows. So, we have to do the type conversion ourselves, after the data has been read in.

As you've shown in your question, df['a'][1] has type str, but you'd like it to have type list.

So, let's say we have some string l ='[1, 2, 3]' we could convert it to a list of ints (l=[1, 2, 3]) as [int(val) for val in l.strip('[]').split(',')]. Now, we can use this in conjunction with the .apply method to get what we desire:

df.iloc[1] = df.iloc[1].apply(lambda x : [int(val) for val in x.strip('[]').split(',')])

Putting this example back together we have:

import pandas as pd

# Data as read in by read_excel method
df2 = pd.DataFrame({'a' : [3, '[1, 2, 3]', 'text1'],
                   'b' : [4, '[4, 5, 6, 7]', 'text2']})
print('Type: ', type(df2['a'][1]))
#Type:  <class 'str'>

# Convert strings in row 1 to lists
df2.iloc[1] = df2.iloc[1].apply(lambda x : [int(val) for val in x.strip('[]').split(',')])

print('Type: ', type(df2['a'][1]))
#Type:  <class 'list'>

dict2 = df2.to_dict(orient='list')
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.