1

I have 2 dataframes:

df.head()
Out[2]: 
   Unnamed: 0 Symbol        Date      Close
0        4061      A  2016-01-13  36.515889
1        4062      A  2016-01-14  36.351784
2        4063      A  2016-01-15  36.351784
3        4064      A  2016-01-19  36.590483
4        4065      A  2016-01-20  35.934062

and

dfw.head()
Out[3]: 
         Symbol       Weight
0  A              (0.000002)
1  AA              0.000112 
2  AAC            (0.000004)
3  AAL             0.000006 
4  AAMC            0.000002 

ISSUE: Not every symbol if df will have a weight in dfw. If it does not I want to drop it from my new dataframe (all dates of it). If the symbol is in dfw I want to merge the weight in with df so that each row has symbol, date, close and weight. I have tried the following but get NaN values. I also am not sure how to remove all symbols with no weights even if I was successful.

dfall = df.merge(dfw, on='Symbol', how='left')

dfall.head()
Out[14]: 
   Unnamed: 0 Symbol        Date      Close Weight
0        4061      A  2016-01-13  36.515889    NaN
1        4062      A  2016-01-14  36.351784    NaN
2        4063      A  2016-01-15  36.351784    NaN
3        4064      A  2016-01-19  36.590483    NaN
4        4065      A  2016-01-20  35.934062    NaN

1 Answer 1

3
df_all = df[df.Symbol.isin(dfw.Symbol.unique())].merge(dfw, how='left', on='Symbol')

I am not sure why you are getting NaN values. Perhaps you have spaces in you your symbols? You can clean them via: dfw['Symbol'] = dfw.Symbol.str.strip() You would need to do the same for df.

>>> df_all
Unnamed: 0 Symbol        Date      Close      Weight
   0  4061      A  2016-01-13  36.515889  (0.000002)
   1  4062      A  2016-01-14  36.351784  (0.000002)
   2  4063      A  2016-01-15  36.351784  (0.000002)
   3  4064      A  2016-01-19  36.590483  (0.000002)
   4  4065      A  2016-01-20  35.934062  (0.000002)
Sign up to request clarification or add additional context in comments.

2 Comments

i needed to add str. in front of strip(), df['Symbol'] = dfw.Symbol.str.strip() dfw['Symbol'] = dfw.Symbol.str.strip()
actually, close, but, this gives me one occurrence for each symbol, I need to have the weight added to the symbol for each date.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.