I am encountering a strange problem with a pandas dataframe: where() fails, complaining that it cannot join with no overlapping index names.
To reproduce the problem, run the following:
import yfinance as yf
from datetime import datetime
startdate=datetime(2022,12,1)
enddate=datetime(2022,12,6)
y_symbols = ['GOOG', 'AAPL', 'MSFT']
data=yf.download(y_symbols, start=startdate, end=enddate, auto_adjust=True, threads=True)
data[data['Close'] > 100]
This raises the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
..
File "lib/python3.9/site-packages/pandas/core/indexes/base.py", line 229, in join
join_index, lidx, ridx = meth(self, other, how=how, level=level, sort=sort)
File "lib/python3.9/site-packages/pandas/core/indexes/base.py", line 4658, in join
return self._join_multi(other, how=how)
File "lib/python3.9/site-packages/pandas/core/indexes/base.py", line 4782, in _join_multi
raise ValueError("cannot join with no overlapping index names")
ValueError: cannot join with no overlapping index names
Here, data looks like:
Close High ... Open Volume
AAPL GOOG MSFT AAPL GOOG MSFT ... AAPL GOOG MSFT AAPL GOOG MSFT
Date ...
2022-12-01 148.309998 101.279999 254.690002 149.130005 102.589996 256.119995 ... 148.210007 101.400002 253.869995 71250400 21771500 26041500
2022-12-02 147.809998 100.830002 255.020004 148.000000 101.150002 256.059998 ... 145.960007 99.370003 249.820007 65421400 18812200 21522800
2022-12-05 146.630005 99.870003 250.199997 150.919998 101.750000 253.820007 ... 147.770004 99.815002 252.009995 68826400 19955500 23435300
What is missing from the dataframe that prevents this from working? For reference, the boolean mask and the full dataframe have different shapes:
>>> (feed_tail['Close'] > 100).shape
(3, 3)
>>> feed_tail.shape
(3, 15)
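The shape mismatch above can be reproduced without yfinance. Below is a minimal sketch, assuming the download returns two unnamed column levels (field, then ticker), as the printed output suggests. The frame is built by hand from rounded Close and Open values shown above, and the names close, mask, close_only and all_above are only illustrative. As far as I can tell, the mask has single-level columns (AAPL, GOOG, MSFT) while data has two-level columns, so pandas has no shared level name to align them on.

```python
import pandas as pd

# Hand-built stand-in for the yfinance result (an assumption, not real download
# output): two unnamed column levels, (field, ticker), values rounded from above.
columns = pd.MultiIndex.from_product([["Close", "Open"], ["AAPL", "GOOG", "MSFT"]])
index = pd.DatetimeIndex(["2022-12-01", "2022-12-02", "2022-12-05"], name="Date")
data = pd.DataFrame(
    [
        [148.31, 101.28, 254.69, 148.21, 101.40, 253.87],
        [147.81, 100.83, 255.02, 145.96, 99.37, 249.82],
        [146.63, 99.87, 250.20, 147.77, 99.815, 252.01],
    ],
    index=index,
    columns=columns,
)

close = data["Close"]  # single-level columns: AAPL, GOOG, MSFT
mask = close > 100     # boolean frame, one column per ticker

# Indexing the two-level frame with the single-level mask is the failing step.
try:
    data[mask]
except ValueError as exc:
    print("masking the full frame failed:", exc)

# Option 1: apply the mask to the frame it was derived from (same columns).
close_only = close[mask]

# Option 2: reduce the mask to one boolean per row and filter whole rows.
all_above = data[mask.all(axis=1)]
```

Option 1 keeps every date and blanks out the closes at or below 100; option 2 keeps only the dates on which every ticker closed above 100, with all fields intact. Which one is appropriate depends on what the filter is meant to select.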