Pandas array from url - only NaN values

Question

I'm running into a problem when trying to download data from a URL and put it into an array using pandas:

When I simply do the following:

EUR_USD_TICK = pd.read_csv('https://tickdata.fxcorporate.com/EURUSD/2015/1.csv.gz')

I get this error:

CParserError: Error tokenizing data. C error: Expected 2 fields in line 14, saw 3

I assume this is because the data retrieved from the URL is compressed.

So I tried to set the compression to gzip:

EUR_USD_TICK = pd.read_csv('https://tickdata.fxcorporate.com/EURUSD/2015/1.csv.gz',compression='gzip')

With that, I get no error message, but the array contains only NaN values:

          D  Unnamed: 1  Unnamed: 2
0       NaN         NaN         NaN
1       NaN         NaN         NaN
2       NaN         NaN         NaN
3       NaN         NaN         NaN
4       NaN         NaN         NaN

The length of the array matches the length of the csv file, so I have no idea why the array is filled with NaN values.

Would anyone have an idea of why I'm getting this issue?

Thanks

PS: The csv data that I'm trying to download:

DateTime,Bid,Ask
01/04/2015 22:00:00.389,1.19548,1.19557
01/04/2015 22:00:00.406,1.19544,1.19556
01/04/2015 22:00:00.542,1.19539,1.19556
01/04/2015 22:00:00.566,1.19544,1.19556

Well, I don't think you have to worry about compression, because when I try to download tickdata.fxcorporate.com/EURUSD/2015/1.csv it returns a file, but I can't find an encoding that works on it.
– Dylan, Commented Oct 25, 2017 at 16:11

Nathan H · Accepted Answer · 2017-10-25 17:37:58Z

2

It seems that the file contains NUL characters ('\x00') that Python/pandas doesn't handle well.

So I used the gzip module to decompress the file manually and then removed those characters. After that, pandas reads the data without issues.

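import gzip
from io import BytesIO, StringIO

import pandas as pd
import requests

# Download the raw gzip-compressed bytes.
r = requests.get('https://tickdata.fxcorporate.com/EURUSD/2015/1.csv.gz')

# Decompress in memory.
gzip_file = gzip.GzipFile(fileobj=BytesIO(r.content))

# Decode, strip the NUL characters, then hand the cleaned text to pandas.
text = gzip_file.read().decode('utf-8').replace('\x00', '')
EUR_USD_TICK = pd.read_csv(StringIO(text))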
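
To try the same cleanup without hitting the network, here is a minimal sketch on synthetic data. It assumes the real file looks like the sample rows from the question with a NUL byte after every character; that layout is a guess, not something confirmed in this thread. The helper name read_gzipped_csv_stripping_nuls is made up for illustration.

```python
import gzip
from io import StringIO

import pandas as pd


def read_gzipped_csv_stripping_nuls(gz_bytes, encoding="utf-8"):
    """Decompress gzip bytes, drop NUL characters, and parse the rest as CSV."""
    text = gzip.decompress(gz_bytes).decode(encoding)
    return pd.read_csv(StringIO(text.replace("\x00", "")))


# Synthetic stand-in for the downloaded file (assumption: a NUL byte
# follows every character of otherwise ordinary CSV text).
sample = (
    "DateTime,Bid,Ask\n"
    "01/04/2015 22:00:00.389,1.19548,1.19557\n"
    "01/04/2015 22:00:00.406,1.19544,1.19556\n"
)
noisy = "".join(ch + "\x00" for ch in sample).encode("utf-8")
gz_bytes = gzip.compress(noisy)

df = read_gzipped_csv_stripping_nuls(gz_bytes)
print(df)
```

With the real download, r.content from the snippet above would take the place of gz_bytes.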

