The pandas df is created by reading CSV file with more than 100+ columns and 2+ mil rows which looks similar to
project_code,dict_data
A001,{"a": "1", "b": "2", "c": "3"}
B001,{"a": "4", "b": "5", "c": "6"}
C001,{"b": "7", "c": "8"}
D001,{"c": "9"}
E001,{"a": "10", "c": "11"}
Using for loop & if conditions, the dict column key value data is parsed and the df looks as follow. And at times new keys get added and I have to update my if conditions to parse those keys
project_code dict_data a b c d
A001 {"a": "1", "b": "2", "c": "3"} "1" "2" "3" ""
B001 {"a": "4", "b": "5", "d": "6"} "4" "5" "" "6"
C001 {"b": "7", "c": "8"} "" "7" "8" ""
D001 {"d": "9"} "" "" "" "9"
E001 {"a": "10", "c": "11"} "10" "" "11" ""
The code takes hours to run and is not the best or efficient way for processing dict data column. Would appreciate if someone could guide or suggest me the better & efficient way.