I'm trying to develop a personal stock screening tool, but I keep getting a "year is out of range" error when I try to convert a column of timestamps into a readable datetime format. I'll be running this code over thousands of CSVs. In theory I could deal with the date issue later, but the fact that I can't get it working now is quite annoying.
The code below is most of the function I'm working with. It navigates to the file location, checks that each file isn't empty, then begins working on it.
I'm sure there are more elegant ways to navigate to the directory and grab the intended files, but I'm currently only concerned with the inability to convert the timestamps.
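For what it's worth, one possible way to collect the files without changing the working directory is sketched below. The helper name non_empty_txt_files is made up for illustration and is not part of the original code; it only uses the standard library.

```python
import os
from glob import glob


def non_empty_txt_files(directory):
    """Return the sorted paths of the non-empty .txt files in `directory`.

    Hypothetical helper: it replaces the os.chdir / os.listdir / os.stat
    combination with a glob pattern and a size check.
    """
    pattern = os.path.join(directory, "*.txt")
    return sorted(path for path in glob(pattern) if os.path.getsize(path) > 0)
```

Each returned path could then be passed straight to pd.read_csv, since the paths already include the directory.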
I've seen solutions to this issue when the timestamps were in a series, i.e.:
dates = ['1449866579', '1449866580', '1449866699'...]
but I can't seem to get that solution to work on a DataFrame.
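For reference, the series case I mean looks roughly like the sketch below. It uses only the three values shown above (the rest of the list is omitted), and the variable name converted is just for illustration. The strings are cast to integers first, so pandas can treat them as seconds since the Unix epoch.

```python
import pandas as pd

# Only the three timestamps quoted above; the original list is longer.
dates = ['1449866579', '1449866580', '1449866699']

# Cast the strings to integers, then interpret them as epoch seconds.
converted = pd.to_datetime(pd.Series(dates).astype('int64'), unit='s')
print(converted)
```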
This is a sample of the CSV file:
1449866579,113.2100,113.2700,113.1600,113.2550,92800
1449866580,113.1312,113.2200,113.0700,113.2200,135800
1449866699,113.1150,113.1500,113.0668,113.1300,106000
1449866700,113.1800,113.2000,113.1200,113.1200,125800
1449866764,113.1200,113.1800,113.0700,113.1490,130900
1449866821,113.0510,113.1223,113.0500,113.1200,110400
1449866884,113.1000,113.1400,113.0100,113.0800,388000
1449866999,113.0900,113.1200,113.0700,113.0900,116700
1449867000,113.2000,113.2100,113.0770,113.1000,191500
1449867119,113.2250,113.2300,113.1400,113.2000,114400
1449867120,113.1300,113.2500,113.1000,113.2300,146700
1449867239,113.1300,113.1800,113.1250,113.1300,108300
1449867299,113.0930,113.1300,113.0700,113.1300,166600
1449867304,113.0850,113.1100,113.0300,113.1000,167000
1449867360,113.0300,113.1100,113.0200,113.0800,204300
1449867479,113.0700,113.0800,113.0200,113.0300,197100
1449867480,113.1600,113.1700,113.0500,113.0700,270200
1449867540,113.1700,113.2900,113.1300,113.1500,3882400
1449867600,113.1800,113.1800,113.1800,113.1800,3500
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime
import time
import os
import sys


def analysis():
    try:
        # training_1d is the data directory; its definition is not shown here
        os.chdir(training_1d)
        for i in os.listdir(os.getcwd()):
            if i.endswith('.txt'):
                if os.stat(i).st_size > 0:
                    print(i + " is good for analysis...")
                    try:
                        df = pd.read_csv(i, header=None, names=['date', 'open', 'high', 'low', 'close', 'volume'])
                        print(df.head())
                        print(df.columns)
                        # This is the line that raises "year is out of range"
                        df['date'] = pd.to_datetime(df['date'], unit='s')
                        print(df.head())
                    except Exception as e:
                        print(str(e) + " Analysis Failed...")
                elif os.stat(i).st_size == 0:
                    print(i + " is an empty file")
                    continue
    except Exception as e:
        # sys.exc_info() returns the traceback of the exception being handled
        print(str(e) + " Something went wrong here...check: " + str(sys.exc_info()[2].tb_lineno))
Here's the output error...
AAPL.txt is good for analysis...
         date     open     high      low     close  volume
0  1449865921  113.090  113.180  113.090  113.1601   89300
1  1449865985  113.080  113.110  113.030  113.0900   73100
2  1449866041  113.250  113.280  113.050  113.0900  101800
3  1449866100  113.240  113.305  113.205  113.2400  199900
4  1449866219  113.255  113.300  113.190  113.2500   96700
Index([u'date', u'open', u'high', u'low', u'close', u'volume'], dtype='object')
year is out of range Analysis Failed...
Any help is greatly appreciated... Thank you.
Thanks to EdChum's suggestion in the comments, the following replacement fixes the error:
Replacing:
df['date'] = pd.to_datetime(df['date'],unit='s')
With:
df['date'] = pd.to_datetime(df['date'].astype(int), unit='s')
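Here is a small self-contained sketch of that replacement applied to the first three rows of the sample CSV above. Reading from an in-memory string via io.StringIO is my own shortcut for illustration; the real code reads from files. I have not pinned down why the explicit cast was needed in my environment, so treat this as a demonstration of the workaround rather than an explanation of the cause.

```python
import io

import pandas as pd

# First three rows of the sample CSV, held in memory instead of a file.
sample = u"""1449866579,113.2100,113.2700,113.1600,113.2550,92800
1449866580,113.1312,113.2200,113.0700,113.2200,135800
1449866699,113.1150,113.1500,113.0668,113.1300,106000
"""

columns = ['date', 'open', 'high', 'low', 'close', 'volume']
df = pd.read_csv(io.StringIO(sample), header=None, names=columns)

# Cast the column to int explicitly, then interpret it as epoch seconds.
df['date'] = pd.to_datetime(df['date'].astype(int), unit='s')

print(df.dtypes)
print(df.head())  # first date value: 2015-12-11 20:42:59
```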
date? As for me, pd.to_datetime(df['date'], unit='s') works. What are your pandas and numpy versions?
df['date'] = pd.to_datetime(df['date'].astype(int), unit='s')
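To answer questions like the ones in that comment, something along these lines can be run. It is only a sketch: the two-row DataFrame is a stand-in built from the sample timestamps, not the real file.

```python
import numpy as np
import pandas as pd

# Library versions, which are useful when the same call behaves
# differently on two machines.
print("pandas:", pd.__version__)
print("numpy:", np.__version__)

# Stand-in data using two timestamps from the sample CSV.
df = pd.DataFrame({'date': [1449866579, 1449866580]})

# The dtype of the column that is about to be converted.
print(df['date'].dtype)
```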