I have a pandas DataFrame p_df like this

        date_loc        timestamp  
id                                                                    
1       2017-05-29  1496083649   
2       2017-05-29  1496089320   
3       2017-05-29  1496095148   
4       2017-05-30  1496100936   
...

and a dict like this one

observations = {
   '1496089320': {
       'col_a': 'value_a',
       'col_b': 'value_b',
       'col_c': 'n/a'
   },
   '1496100936': {
       'col_b': 'value_b'
   },
   ...
}

I'd like to add the values from each observations sub-dict as new columns, using the sub-dict keys as column names, for every row whose timestamp matches a key in the dict, so that the resulting dataframe is

        date_loc     timestamp     col_a    col_b   col_c
id                                                                    
1       2017-05-29  1496083649   
2       2017-05-29  1496089320   value_a  value_b     n/a
3       2017-05-29  1496095148   
4       2017-05-30  1496100936            value_b
...

I tried several methods (agg(), apply(), iterrows()), but nothing has worked so far. Here is, for example, my last attempt:

p_df['col_a'] = ''
p_df['col_b'] = ''
p_df['col_c'] = ''

for index, row in p_df.iterrows():
    ts = p_df.loc[index, 'timestamp']
    if ts in observations:
        pass  # how to concat column values in this row?

I suspect there's a better approach than iterating over the rows of the dataframe, so I'm open to alternatives.

1 Answer

You might construct a data frame from the dictionary and then merge with the original data frame on the timestamp column:

import pandas as pd

# make sure the timestamp column has the same type as the dict keys (str)
p_df['timestamp'] = p_df['timestamp'].astype(str)

# build one row per dict key, then left-join so only the rows of p_df are kept
p_df.merge(pd.DataFrame.from_dict(observations, orient='index'),
           left_on='timestamp', right_index=True, how='left').fillna('')

#     date_loc   timestamp   col_b  col_c   col_a
#id                 
#1  2017-05-29  1496083649          
#2  2017-05-29  1496089320  value_b n/a value_a
#3  2017-05-29  1496095148          
#4  2017-05-30  1496100936  value_b     
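
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same merge. The sample frame only contains the four rows shown in the question, and the '1500000000' key is a hypothetical entry added to show that dict keys missing from the dataframe do not produce extra rows with a left join.

```python
import pandas as pd

# Sample data mirroring the four rows shown in the question.
p_df = pd.DataFrame(
    {
        "date_loc": ["2017-05-29", "2017-05-29", "2017-05-29", "2017-05-30"],
        "timestamp": [1496083649, 1496089320, 1496095148, 1496100936],
    },
    index=pd.Index([1, 2, 3, 4], name="id"),
)

observations = {
    "1496089320": {"col_a": "value_a", "col_b": "value_b", "col_c": "n/a"},
    "1496100936": {"col_b": "value_b"},
    # Hypothetical key that is absent from p_df (assumption, not from the question).
    "1500000000": {"col_a": "unused"},
}

# One row per outer dict key, one column per inner key.
obs_df = pd.DataFrame.from_dict(observations, orient="index")

# The dict keys are strings, so the join column must be strings too.
p_df["timestamp"] = p_df["timestamp"].astype(str)

# Left join: keep every row of p_df, attach observation columns where a key matches.
result = p_df.merge(obs_df, left_on="timestamp", right_index=True, how="left")

# Replace the missing values with empty strings, as in the desired output.
result = result.fillna("")
print(result)
```

The new columns appear in whatever order the inner keys are first encountered, so reorder them explicitly if the column order matters.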

2 Comments

It almost works, thank you, but: 1) with fillna() I get this error: raise AssertionError("Gaps in blk ref_locs"); without it, it works. 2) My dict has a lot of keys that are not in the dataframe, so the merge gives me a lot of empty rows.
Sorry, I didn't read your question very carefully. It looks like you need a left join instead of a full join. I'm not sure about the fillna() issue, though; I haven't come across an error like this with fillna before.
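
To illustrate the join-type point from the comments, here is a small sketch comparing how='left' with how='outer'. The two-row frame and the '1500000000' key are made up for the illustration; the sketch does not address the fillna() error.

```python
import pandas as pd

# Tiny stand-in for p_df, with timestamps already stored as strings.
p_df = pd.DataFrame(
    {"timestamp": ["1496083649", "1496089320"]},
    index=pd.Index([1, 2], name="id"),
)

# "1500000000" is a hypothetical key with no matching row in p_df.
observations = {
    "1496089320": {"col_a": "value_a"},
    "1500000000": {"col_a": "other"},
}
obs_df = pd.DataFrame.from_dict(observations, orient="index")

# Left join keeps only the rows of p_df.
left = p_df.merge(obs_df, left_on="timestamp", right_index=True, how="left")

# Outer (full) join also adds a row for each unmatched dict key.
outer = p_df.merge(obs_df, left_on="timestamp", right_index=True, how="outer")

print(len(left), len(outer))
```

With a dict that has many keys absent from the dataframe, the outer join is what produces the extra, mostly empty rows.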
