I am trying to transform the lists in one dataframe column into rows, but I am not sure how to do that efficiently in Python.
My actual data has thousands of rows and lists of variable length (in column Specs), but to simplify, I will use the example below.
import pandas as pd
data = [{'Type': 'A', 'Specs': [['a1', 50], ['a2', 14]]},
        {'Type': 'B', 'Specs': [['b1', 20], ['b2', 25], ['b3', 15], ['b4', 10]]},
        {'Type': 'C', 'Specs': [['c1', 32]]}]
df = pd.DataFrame(data)
The final result should be equivalent to the dataframe built below:
data_out = [{'Type': 'A', 'model': 'a1', 'qty': 50},
            {'Type': 'A', 'model': 'a2', 'qty': 14},
            {'Type': 'B', 'model': 'b1', 'qty': 20},
            {'Type': 'B', 'model': 'b2', 'qty': 25},
            {'Type': 'B', 'model': 'b3', 'qty': 15},
            {'Type': 'B', 'model': 'b4', 'qty': 10},
            {'Type': 'C', 'model': 'c1', 'qty': 32}]
df_out = pd.DataFrame(data_out)
I have tried to use apply with a function that converts each row's list to a dataframe, but I am confused about how to return a dataframe for each row and then expand the original dataframe with the new rows. Please let me know if I am on the wrong track, and what the most efficient way would be to get the required output on large data. Thanks
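def convert_list(my_list):
    # Build a small dataframe from one row's list of [model, qty] pairs
    my_df = pd.DataFrame(my_list, columns=['model', 'qty'])
    return my_df

# This is the step I am stuck on: apply returns one dataframe per row
df[['model', 'qty']] = df['Specs'].apply(convert_list)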
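For reference, one possible way to get the output described above is sketched here. It is not the only option, and it rests on two assumptions: pandas 0.25 or later (where DataFrame.explode is available), and every entry in Specs being a non-empty list of [model, qty] pairs. The names exploded, pairs and result are just illustrative.

```python
import pandas as pd

data = [{'Type': 'A', 'Specs': [['a1', 50], ['a2', 14]]},
        {'Type': 'B', 'Specs': [['b1', 20], ['b2', 25], ['b3', 15], ['b4', 10]]},
        {'Type': 'C', 'Specs': [['c1', 32]]}]
df = pd.DataFrame(data)

# explode unpacks only the outer list, so each [model, qty] pair
# becomes its own row and Type is repeated alongside it.
exploded = df.explode('Specs').reset_index(drop=True)

# Turn the remaining column of pairs into two ordinary columns.
# Assumption: no empty or missing Specs (those would become NaN here).
pairs = pd.DataFrame(exploded['Specs'].tolist(), columns=['model', 'qty'])

# Put Type back next to the new columns; both frames share a 0..n-1 index.
result = pd.concat([exploded[['Type']], pairs], axis=1)
print(result)
```

This avoids calling a Python function that builds a dataframe per row, which is usually the slow part of the apply approach on thousands of rows. If some rows can have an empty Specs list, they would need to be filtered out or handled separately before the second step.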