First, I have done my due diligence and searched previously asked questions for a solution, without success.
I've run into an odd bug in my code that I cannot explain. So far my code does the following:
take stock symbols and write OHLC data to a CSV file
loop through the directory that contains the CSV files and use that data to calculate technical indicators
add the technical indicator data to the same CSV file
The bug is that everything executes perfectly (99 stocks) EXCEPT for ZM.csv (Zoom). The error it prints is:
pandas.errors.EmptyDataError: No columns to parse from file.
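For context, pandas typically raises this error when the file it is given has nothing in it to parse, for example a zero-byte file. A minimal sketch for checking which files in the directory are empty on disk at the moment they are read; `find_empty_csvs` is a hypothetical helper name, and it uses only the standard library:

```python
from pathlib import Path


def find_empty_csvs(directory):
    """Return the names of the .csv files in `directory` that are zero bytes long."""
    empty = []
    for path in sorted(Path(directory).glob("*.csv")):
        # st_size is the size on disk right now; data still sitting in an
        # unflushed write buffer of an open file is not counted.
        if path.stat().st_size == 0:
            empty.append(path.name)
    return empty
```

Calling `find_empty_csvs('data/ohlc')` right before the `pd.read_csv` loop would show whether ZM.csv is actually empty at that point.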
So to troubleshoot I copied and pasted the data from ZM.csv into a CSV that I know ran fine (I used AAPL) and it actually executed fine. Next, I took the working data from AAPL.csv, pasted it into ZM.csv and ran it again. It throws the same error. I also tried renaming the file to ZMI (randomly) and it worked.
This led me to believe that, for some unknown reason, the FILENAME is the root issue. In the part where I first create the CSV files, I changed the file name to {symbol}1.csv, {symbol}_.csv, and {symbol}I.csv, to no avail. Lastly, I combined the two files together and did not mess with anything else. It worked. Does anyone know why?
The flow is to first run bars.py, check the data/ohlc/ directory CSV files (should only have the OHLC data), run technical_analysis.py, and then check the CSV files again (now with technical indicators).
[bars.py]
from config import *
from datetime import datetime
import requests, json

holdings = open('data/qqq.csv').readlines()
symbols_list = [holding.split(',')[2].strip() for holding in holdings][1:]
symbols = ','.join(symbols_list)
minute_bars_url = '{}/1Min?symbols={}&limit=100'.format(BARS_URL, symbols)
r = requests.get(minute_bars_url, headers=HEADERS)
ohlc_data = r.json()
for symbol in ohlc_data:
    filename = 'data/ohlc/{}.csv'.format(symbol)
    f = open(filename, 'w+')
    f.write('Timestamp,Open,High,Low,Close,Volume\n')
    for bar in ohlc_data[symbol]:
        t = datetime.fromtimestamp(bar['t'])
        timestamp = t.strftime('%I:%M:%S%p-%Z%Y-%m-%d')
        line = '{},{},{},{},{},{}\n'.format(timestamp, bar['o'], bar['h'],
                                            bar['l'], bar['c'], bar['v'])
        f.write(line)
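One thing that may be worth ruling out: the loop above never closes f explicitly. Earlier files are typically closed when f is rebound on the next iteration, but the last file opened stays open, and its buffered data may not be on disk yet. Whether that is what affects ZM.csv is an assumption, not something confirmed here. A minimal sketch of writing each file inside a with block instead; `write_bars` and `HEADER` are hypothetical names, and the rows are assumed to be already formatted:

```python
import csv

# Column names taken from the header line written in bars.py.
HEADER = ["Timestamp", "Open", "High", "Low", "Close", "Volume"]


def write_bars(filename, rows):
    """Write the header and the given rows to `filename`.

    The with block closes the file on exit, which also flushes any
    buffered data to disk before another piece of code reads it.
    """
    with open(filename, "w", newline="") as f:
        # lineterminator="\n" matches the '\n' line endings used in bars.py.
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(HEADER)
        writer.writerows(rows)
```

In bars.py this would be called once per symbol, with one row per bar.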
The variables symbols_list and symbols print as follows:
symbols_list = ['AAPL', 'MSFT', 'AMZN', 'FB', 'GOOGL', 'GOOG', 'TSLA', 'NVDA', 'PYPL', 'ADBE', 'INTC', 'NFLX', 'CMCSA', 'PEP', 'COST', 'CSCO', 'AVGO', 'QCOM', 'TMUS', 'AMGN', 'TXN', 'CHTR', 'SBUX', 'ZM', 'AMD', 'INTU', 'ISRG', 'MDLZ', 'JD', 'GILD', 'BKNG', 'FISV', 'MELI', 'ATVI', 'ADP', 'CSX', 'REGN', 'MU', 'AMAT', 'ADSK', 'VRTX', 'LRCX', 'ILMN', 'ADI', 'BIIB', 'MNST', 'EXC', 'KDP', 'LULU', 'DOCU', 'WDAY', 'CTSH', 'KHC', 'NXPI', 'BIDU', 'XEL', 'DXCM', 'EBAY', 'EA', 'IDXX', 'CTAS', 'SNPS', 'ORLY', 'SGEN', 'SPLK', 'ROST', 'WBA', 'KLAC', 'NTES', 'PCAR', 'CDNS', 'MAR', 'VRSK', 'PAYX', 'ASML', 'ANSS', 'MCHP', 'XLNX', 'MRNA', 'CPRT', 'ALGN', 'PDD', 'ALXN', 'SIRI', 'FAST', 'SWKS', 'VRSN', 'DLTR', 'CERN', 'MXIM', 'INCY', 'TTWO', 'CDW', 'CHKP', 'CTXS', 'TCOM', 'BMRN', 'ULTA', 'EXPE', 'FOXA', 'LBTYK', 'FOX', 'LBTYA']
symbols = AAPL,MSFT,AMZN,FB,GOOGL,GOOG,TSLA,NVDA,PYPL,ADBE,INTC,NFLX,CMCSA,PEP,COST,CSCO,AVGO,QCOM,TMUS,AMGN,TXN,CHTR,SBUX,ZM,AMD,INTU,ISRG,MDLZ,JD,GILD,BKNG,FISV,MELI,ATVI,ADP,CSX,REGN,MU,AMAT,ADSK,VRTX,LRCX,ILMN,ADI,BIIB,MNST,EXC,KDP,LULU,DOCU,WDAY,CTSH,KHC,NXPI,BIDU,XEL,DXCM,EBAY,EA,IDXX,CTAS,SNPS,ORLY,SGEN,SPLK,ROST,WBA,KLAC,NTES,PCAR,CDNS,MAR,VRSK,PAYX,ASML,ANSS,MCHP,XLNX,MRNA,CPRT,ALGN,PDD,ALXN,SIRI,FAST,SWKS,VRSN,DLTR,CERN,MXIM,INCY,TTWO,CDW,CHKP,CTXS,TCOM,BMRN,ULTA,EXPE,FOXA,LBTYK,FOX,LBTYA
So ZM is not listed last.
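Note that the write loop in bars.py iterates over ohlc_data (the parsed API response), not over symbols_list, so the position of ZM in symbols_list does not by itself say which file is written last. A small sketch for checking the order that loop actually uses; `last_written_symbol` is a hypothetical helper, and it relies on Python dicts preserving insertion order (3.7+):

```python
def last_written_symbol(ohlc_data):
    """Return the last key of the response dict, or None if it is empty.

    A for loop over a dict visits keys in insertion order, so this is the
    symbol whose file the write loop opens last.
    """
    keys = list(ohlc_data)
    return keys[-1] if keys else None
```

Printing `last_written_symbol(ohlc_data)` at the end of bars.py would show whether ZM happens to be the last key in the response.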
[technical_analysis.py]
import btalib
import pandas as pd
from datetime import datetime
from bars import ohlc_data
from bars import symbols_list as symbols
for symbol in symbols:
    try:
        file_path = f'data/ohlc/{symbol}.csv'
        dataframe = pd.read_csv(file_path,
                                parse_dates=True,
                                index_col='Timestamp')
        sma6 = btalib.sma(dataframe, period=6)
        sma10 = btalib.sma(dataframe, period=10)
        rsi = btalib.rsi(dataframe)
        macd = btalib.macd(dataframe)
        dataframe['SMA-6'] = sma6.df
        dataframe['SMA-10'] = sma10.df
        dataframe['RSI'] = rsi.df
        dataframe['MACD'] = macd.df['macd']
        dataframe['Signal'] = macd.df['signal']
        dataframe['Histogram'] = macd.df['histogram']
        f = open(file_path, 'w+')
        dataframe.to_csv(file_path, sep=',', index=True)
    except:
        print(f'{symbol} is not writing the technical data.')
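The bare except above replaces the real exception with a generic message. A minimal sketch of a loop that keeps the exception type and text for each failing symbol; `run_for_each` and `process` are hypothetical names, with `process` standing in for the per-symbol read/compute/write steps:

```python
def run_for_each(symbols, process):
    """Call process(symbol) for every symbol and collect failures.

    Returns a dict mapping each failing symbol to "ExceptionType: message",
    so the underlying error is visible instead of a generic notice.
    """
    failures = {}
    for symbol in symbols:
        try:
            process(symbol)
        except Exception as exc:
            failures[symbol] = f"{type(exc).__name__}: {exc}"
    return failures
```

With this shape, a failure on ZM would be reported together with its exception type and message, while the remaining symbols still run.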
Is ZM the last item in data/qqq.csv? If you add a bogus symbol at the end, does ZM read successfully?
The error comes from pd.read_csv(); please remove the lines of code after the pd.read_csv and reduce your code to a minimal reproducible example; examples on SO are required to be minimal.
If pd.read_csv() fails on 'ZM.csv', then just chop your example down to that, and show us its first few lines; perhaps the header or data are malformed. Use the absolute minimal lines of code to reproduce that. Also, a debugging tip: you can add a Python assert after read_csv that the dataframe or its columns have the expected number of rows/columns; that will cause an immediate exception if they don't.
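The assert tip from the last comment could be sketched as follows. To keep the example dependency-free, the standard-library csv module stands in for pd.read_csv here; with pandas, the same kind of assertion would go right after the read_csv call and compare dataframe.columns instead. `EXPECTED_COLUMNS`, `read_header`, and `check_header` are hypothetical names:

```python
import csv

# Header that bars.py writes as the first line of every file.
EXPECTED_COLUMNS = ["Timestamp", "Open", "High", "Low", "Close", "Volume"]


def read_header(file_path):
    """Return the first CSV row of the file, or [] if the file is empty."""
    with open(file_path, newline="") as f:
        return next(csv.reader(f), [])


def check_header(file_path):
    """Fail immediately, naming the file, if the header is not the expected one."""
    header = read_header(file_path)
    assert header == EXPECTED_COLUMNS, f"{file_path}: unexpected header {header!r}"
    return header
```

Running `check_header('data/ohlc/ZM.csv')` just before the read would fail at once if the file is empty or the header is malformed.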