Method1
def getAndBuildDatafrmeFromCsvBasic(filename):
colTypes = {'Open': 'float64', 'High': 'float64', 'Low': 'float64', 'Close': 'float64', 'Volume': 'float64'}
dfEurUsd2017 = pd.read_csv(filename, delimiter=",", index_col='Gmt time', dtype=colTypes, parse_dates=['Gmt time'])
return dfEurUsd2017
Method2
def getAndBuildDatafrmeFromCsv(filename):
df = pd.read_csv(filename, header=None)
df.columns = ['date', 'Open', 'High', 'Low', 'Close', 'Volume']
df.date = pd.to_datetime(df.date, format='%d.%m.%Y %H:%M:%S.%f')
df.index = df['date']
df = df[['Open', 'High', 'Low', 'Close', 'Volume']]
return df
Results Method1
Open High Low Close Volume
Gmt time
2017-12-04 23:00:00 1.06672 1.06699 1.06636 1.06698 1889.56
Results Method2
Open High Low Close Volume
Gmt time
2017-12-04 23:00:00 1.18686 1.18699 1.18666 1.18682 2004.46
Why does method1 incorrectly parse the values for Open,High,Low,Close,Volume? Method2 produces the correct output. I'm concerned why both methods output completely different numerical values, even the volume is different. Yet the csv file is the same.
Rows from CSV
04.12.2017 23:00:00.000,1.18686,1.18699,1.18666,1.18682,2004.4599999999998
04.12.2017 23:30:00.000,1.18682,1.18706,1.18652,1.18681,1242.68
05.12.2017 00:00:00.000,1.18681,1.18691,1.18639,1.18653,2666.81
05.12.2017 00:30:00.000,1.18653,1.18726,1.18650,1.18709,3567.2400000000007
05.12.2017 01:00:00.000,1.18708,1.18750,1.18707,1.18738,3105.4699999999993
05.12.2017 01:30:00.000,1.18738,1.18744,1.18691,1.18732,3561.5
05.12.2017 02:00:00.000,1.18732,1.18766,1.18704,1.18740,2706.6400000000003