I was recently rewriting some code for Python 3 and searching for a clean, Pythonic way to decode the bytes returned by urllib.request.urlopen before passing them to csv.reader.
I came up with the following:
import csv
import urllib.request

def fetch(symbol='IBM'):
    kwargs = {
        'symbol': symbol,
        'start_month': '01',
        'start_day': '01',
        'start_year': '2002',
        'end_month': '12',
        'end_day': '31',
        'end_year': '2012',
    }
    urlstring = ('http://ichart.finance.yahoo.com/table.csv'
                 '?s={symbol}&a={start_month}&b={start_day}&c={start_year}'
                 '&d={end_month}&e={end_day}&f={end_year}'
                 '&g=d&ignore=.csv').format(**kwargs)
    # Decode each bytes line to str before csv.reader sees it;
    # iter(lambda: 0, 1) is an infinite iterator, used only so that
    # map() keeps supplying 'iso-8859-1' as the second argument.
    data = [row for row in csv.reader(
        map(bytes.decode,
            urllib.request.urlopen(urlstring),
            ('iso-8859-1' for i in iter(lambda: 0, 1))))]
    return data
I am wondering if there is a better solution. Essentially, the URL returns a CSV file, and in Python 2.x I could simply pass the return value of urllib2.urlopen() straight to csv.reader(). In Python 3.x, however, urlopen() yields bytes, so I map the response through bytes.decode and pass the result to csv.reader. Is there a better way to do this, or did I miss something while searching for a cleaner solution?
What is the proper, Pythonic way to handle cases like this, where an object's output needs to be decoded before it is passed to another function that iterates over it?
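For reference, here is a minimal, self-contained sketch of the decode-as-you-iterate pattern described above, using io.BytesIO as a stand-in for the HTTP response (both iterate as bytes lines; the sample CSV rows are made up):

```python
import csv
import io

# Stand-in for urllib.request.urlopen(...): iterating it yields bytes lines.
raw = io.BytesIO(b'Date,Close\n2012-12-31,191.55\n')

# Decode each bytes line to str before handing it to csv.reader.
rows = list(csv.reader(line.decode('iso-8859-1') for line in raw))
print(rows)  # [['Date', 'Close'], ['2012-12-31', '191.55']]
```

A generator expression avoids the infinite-iterator trick needed to feed the encoding argument through map().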
Edit: Thanks, Ignacio!
Following the link you gave me, I arrived at this solution (it also needs import codecs):
data = [row for row in csv.reader(codecs.iterdecode(urllib.request.urlopen(urlstring), 'iso-8859-1'))]
Which looks much cleaner!
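The codecs.iterdecode approach can be demonstrated without a live URL; here io.BytesIO again stands in for the urlopen() response, and the CSV rows are made up:

```python
import codecs
import csv
import io

# Stand-in for urllib.request.urlopen(urlstring): yields bytes line by line.
response = io.BytesIO(b'Date,Close\n2012-12-31,191.55\n')

# codecs.iterdecode lazily decodes each bytes chunk to str,
# so csv.reader receives an iterator of text lines.
data = list(csv.reader(codecs.iterdecode(response, 'iso-8859-1')))
print(data)  # [['Date', 'Close'], ['2012-12-31', '191.55']]
```

Because iterdecode is lazy, the response is decoded incrementally rather than read into memory all at once.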