I'm trying to read data from an online CSV file and build a table from it. I use splitlines() to split the decoded response into lines for csv.reader, but I keep getting a ValueError:

ValueError: invalid literal for int() with base 10: 'Year'

Here is my code:

import csv
import urllib.request

url = "https://raw.github.com/datasets/gdp/master/data/gdp.csv"
webpage = urllib.request.urlopen(url)
datareader = csv.reader(webpage.read().decode('utf-8').splitlines())
dataList = []
NewTable = []
print('done')
for row in datareader:
    ##print(row)
    countryName, countryCode, Year, Value= row
    print(Year)
    Year = int(Year)
    ##Value = float(Value)
    rowTuple = countryName, countryCode, Year, Value
    dataList.append(rowTuple)

With print(Year) enabled, the output looks like a list of integers, all between 1960 and 2012, so I can't figure out why the conversion from string to integer fails.

Any ideas?

1 Answer

The first row of your CSV is a header row, not a data row, so int() ends up being called on the string 'Year':

Country Name,Country Code,Year,Value

Skip it with:

datareader = csv.reader(webpage.read().decode('utf-8').splitlines())
next(datareader, None)  # skip the header

You could also wrap the response in an io.TextIOWrapper() object so that it is decoded from UTF-8 for you, without reading the whole body and splitting it yourself:

import io

webpage = urllib.request.urlopen(url)
datareader = csv.reader(io.TextIOWrapper(webpage, encoding='utf-8', newline=''))
next(datareader, None)  # skip the header
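
To see the header skip and the conversions working together, here is a minimal sketch that runs without a network connection. It assumes the file has the four columns shown in the header above; SAMPLE and load_rows() are hypothetical names, and the two data rows are made up for illustration, not taken from the real dataset.

```python
import csv
import io

# Hypothetical stand-in for the downloaded file; the data rows are invented.
SAMPLE = """Country Name,Country Code,Year,Value
Exampleland,EXL,1960,100.5
Exampleland,EXL,1961,110.25
"""


def load_rows(text_stream):
    """Return (name, code, year, value) tuples, skipping the header row."""
    reader = csv.reader(text_stream)
    next(reader, None)  # skip the header; None avoids StopIteration on empty input
    rows = []
    for country_name, country_code, year, value in reader:
        # Safe to convert now: only data rows reach this point.
        rows.append((country_name, country_code, int(year), float(value)))
    return rows


data_list = load_rows(io.StringIO(SAMPLE))
print(data_list)
```

With the real URL you would pass the io.TextIOWrapper object to load_rows() instead of the io.StringIO one.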

1 Comment

You could suggest using DictReader too.
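
A rough sketch of what that might look like: csv.DictReader consumes the header row itself and uses it as the keys, so there is nothing to skip by hand. The sample text and the records name are assumptions for illustration, and the data rows are made up.

```python
import csv
import io

# Hypothetical stand-in for the downloaded file; the data rows are invented.
SAMPLE = """Country Name,Country Code,Year,Value
Exampleland,EXL,1960,100.5
Exampleland,EXL,1961,110.25
"""

# DictReader reads the first row as field names, so every row it yields is data.
reader = csv.DictReader(io.StringIO(SAMPLE))
records = [
    {
        "name": row["Country Name"],
        "code": row["Country Code"],
        "year": int(row["Year"]),
        "value": float(row["Value"]),
    }
    for row in reader
]

print(records[0]["year"])
```

The trade-off is a dict per row instead of a tuple, which costs a little more memory but makes the column lookups self-describing.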
