Add columns on a pandas DataFrame with data inside a dictionary

Question

I have a pandas DataFrame p_df like this:

        date_loc        timestamp  
id                                                                    
1       2017-05-29  1496083649   
2       2017-05-29  1496089320   
3       2017-05-29  1496095148   
4       2017-05-30  1496100936   
...

and a dict like this one:

observations = {
   '1496089320': {
       'col_a': 'value_a',
       'col_b': 'value_b',
       'col_c': 'n/a'
   },
   '1496100936': {
       'col_b': 'value_b'
   },
   ...
}

I'd like to add the values from each observations sub-dict as new columns, using the sub-dict keys as column names, for every row whose timestamp also exists as a key in the dict, so that the resulting dataframe is

        date_loc     timestamp     col_a    col_b   col_c
id                                                                    
1       2017-05-29  1496083649   
2       2017-05-29  1496089320   value_a  value_b     n/a
3       2017-05-29  1496095148   
4       2017-05-30  1496100936            value_b
...

I tried several methods (agg(), apply(), iterrows()), but nothing has worked yet. Here is my last attempt, for example:

p_df['col_a'] = ''
p_df['col_b'] = ''
p_df['col_c'] = ''

for index, row in p_df.iterrows():
    ts = p_df.loc[index, 'timestamp']
    if ts in observations:
        # how to concat column values in this row?
        pass
    # end if
# end for

I suspect there is a better approach than iterating over the rows of the dataframe, so I'm open to alternatives.

akuiper · Accepted Answer · 2017-05-29 16:15:42Z

You might construct a data frame from the dictionary and then merge with the original data frame on the timestamp column:

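import pandas as pd

# make sure the timestamp column and the dict keys are of the same type (str)
p_df['timestamp'] = p_df['timestamp'].astype(str)

# from_dict(..., orient='index') gives one row per dict key, one column per inner key
p_df.merge(pd.DataFrame.from_dict(observations, orient='index'),
           left_on='timestamp', right_index=True, how='left').fillna('')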

#     date_loc   timestamp   col_b  col_c   col_a
#id                 
#1  2017-05-29  1496083649          
#2  2017-05-29  1496089320  value_b n/a value_a
#3  2017-05-29  1496095148          
#4  2017-05-30  1496100936  value_b
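
For anyone who wants to try the merge approach end to end, here is a minimal self-contained sketch. It rebuilds the sample frame and dict from the question (only the four rows and two keys shown there, since the rest is elided), and the names obs_df and result are introduced here for illustration only.

```python
import pandas as pd

# Sample data copied from the question; the frame is indexed by id.
p_df = pd.DataFrame(
    {
        "date_loc": ["2017-05-29", "2017-05-29", "2017-05-29", "2017-05-30"],
        "timestamp": [1496083649, 1496089320, 1496095148, 1496100936],
    },
    index=pd.Index([1, 2, 3, 4], name="id"),
)

observations = {
    "1496089320": {"col_a": "value_a", "col_b": "value_b", "col_c": "n/a"},
    "1496100936": {"col_b": "value_b"},
}

# The dict keys are strings, so the timestamp column is cast to str to match.
p_df["timestamp"] = p_df["timestamp"].astype(str)

# One row per dict key, one column per inner key (obs_df is an illustrative name).
obs_df = pd.DataFrame.from_dict(observations, orient="index")

# Left join: every row of p_df is kept, unmatched rows get NaN, then "" after fillna.
result = p_df.merge(
    obs_df, left_on="timestamp", right_index=True, how="left"
).fillna("")
print(result)
```

The column order of the added columns can differ between pandas versions, so it is safer to select them by name than by position.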

edited May 29, 2017 at 16:15

answered May 29, 2017 at 16:00

akuiper


2 Comments

Fabrizio Calderan Over a year ago

It almost works, thank you, but 1) with fillna() I get this error: raise AssertionError("Gaps in blk ref_locs"); without it, it works. 2) My dict has a lot of keys that are not in the dataframe, so the merge gives me a lot of empty rows.

akuiper Over a year ago

Sorry, I didn't read your question very carefully. It looks like you need a left join instead of a full join. I'm not sure about the fillna() issue, though; I haven't come across an error like this with fillna before.
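
To illustrate the two points raised in these comments, here is a small sketch with a hypothetical dict that contains one key ('9999999999') missing from the dataframe. It compares how='left' with how='outer', and it fills only the columns that came from the dict instead of calling fillna() on the whole frame. Whether that avoids the AssertionError reported above is not confirmed anywhere in this thread; it is only one thing to try.

```python
import pandas as pd

p_df = pd.DataFrame(
    {"timestamp": ["1496083649", "1496089320"]},
    index=pd.Index([1, 2], name="id"),
)

# Hypothetical dict: the second key has no matching row in p_df.
observations = {
    "1496089320": {"col_a": "value_a"},
    "9999999999": {"col_a": "unmatched"},
}
obs_df = pd.DataFrame.from_dict(observations, orient="index")

# Left join keeps exactly the rows of p_df; extra dict keys are dropped.
left = p_df.merge(obs_df, left_on="timestamp", right_index=True, how="left")

# Outer (full) join also keeps dict keys that have no row in p_df.
outer = p_df.merge(obs_df, left_on="timestamp", right_index=True, how="outer")

# Fill only the columns that came from the dict, leaving the others untouched.
new_cols = list(obs_df.columns)
left[new_cols] = left[new_cols].fillna("")

print(len(left), len(outer))
```

With how='left' the result has as many rows as p_df, while how='outer' adds one row per unmatched dict key, which matches the "lot of empty rows" symptom described in the first comment.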
