0

I am using pandas and I have two dataframes which have the same number of rows:

All I want to do is to merge them together by ticker, which I tried with

a_no_nan2_last.join(ref, on = "ticker")

which gave me this error

ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

So I checked the types of the column ticker in both and I checked that ticker is the only column in overlaps

set(a_no_nan2_last.columns).intersection(set(res.columns))

which gives

{'ticker'}

and

type(a_no_nan2_last["ticker"].iloc[0])
<class 'str'>
type(res["ticker"].iloc[0])
<class 'str'>

the data looks like

a_no_nan2_last
       ticker       Date   Close  Close_lead1
2141      1AD 2018-08-14  0.2845        0.160
5354      1AG 2018-08-14  0.0450        0.036
75158     1AL 2018-08-14  0.9287        0.800
188       1ST 2018-08-14  0.0370        0.079
81195     3DP 2018-08-14  0.0320        0.041
       ...        ...     ...          ...
111688    ZMI 2018-08-14  0.1200        0.095
111310    ZNC 2018-08-14  0.1650        0.078
111877    ZNO 2018-08-13  0.1150        0.079
111373    ZTA 2018-08-10  0.0700        0.070
111940    ZYB 2018-08-09  0.0070        0.014
[1792 rows x 4 columns]

and

res
variable ticker  ...  Close__variance_larger_than_standard_deviation
0           1AD  ...                                             0.0
1           1AG  ...                                             0.0
2           1AL  ...                                             0.0
3           1ST  ...                                             0.0
4           3DP  ...                                             0.0
         ...  ...                                             ...
1787        ZMI  ...                                             0.0
1788        ZNC  ...                                             0.0
1789        ZNO  ...                                             0.0
1790        ZTA  ...                                             0.0
1791        ZYB  ...                                             0.0
[1792 rows x 795 columns]

so I tried concat, which appends by row not by column giving me the wrong result

pd.concat([a_no_nan2_last, res], axis=1)
       ticker  ... Close__variance_larger_than_standard_deviation
0         NaN  ...                                            0.0
1         NaN  ...                                            0.0
2         NaN  ...                                            0.0
3         NaN  ...                                            0.0
4         NaN  ...                                            0.0
       ...  ...                                            ...
111688    ZMI  ...                                            NaN
111751    Z1P  ...                                            NaN
111814    ZIP  ...                                            NaN
111877    ZNO  ...                                            NaN
111940    ZYB  ...                                            NaN
[3556 rows x 799 columns]

I expected my output to be 1792 rows long.

Dropping the common column ticker also doesn't help, see

res3 = res.drop("ticker", axis=1)
pd.concat([a_no_nan2_last, res3], axis=1)

       ticker  ... Close__variance_larger_than_standard_deviation
0         NaN  ...                                            0.0
1         NaN  ...                                            0.0
2         NaN  ...                                            0.0
3         NaN  ...                                            0.0
4         NaN  ...                                            0.0
       ...  ...                                            ...
111688    ZMI  ...                                            NaN
111751    Z1P  ...                                            NaN
111814    ZIP  ...                                            NaN
111877    ZNO  ...                                            NaN
111940    ZYB  ...                                            NaN
[3556 rows x 798 columns]
1

1 Answer 1

0

You need to reset_index(drop=True) e.g.

res1 = pd.concat([a_no_nan2_last.reset_index(drop=True), res.reset_index(drop=True)], axis=1)
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.