
I have a CSV file called beers.csv and I am trying to read four columns from it (brewery_name, beer_style, beer_name, beer_abv), all of which are column headers. I have the following code:

import csv

csvFile = "beers.csv"
csvData = csv.reader(csvFile, delimiter=',', skipinitialspace=True) 

beer, abv, style, brewery = set(), set(), set(), set()
for row in csvData:
    beer.add(row[10])
    abv.add(row[11])
    style.add(row[7])
    brewery.add(row[1])
print(beer, abv, style, brewery) 

For some reason, I am getting an error:

Traceback (most recent call last):
  File "filter.py", line 8, in <module>
    beer.add(row[10])
IndexError: list index out of range

Here are the first 10 lines of my beers.csv file:

brewery_id,brewery_name,review_time,review_overall,review_aroma,review_appearance,review_profilename,beer_style,review_palate,review_taste,beer_name,beer_abv,beer_beerid
10325,Vecchio Birraio,1234817823,1.5,2,2.5,stcules,Hefeweizen,1.5,1.5,Sausa Weizen,5,47986
10325,Vecchio Birraio,1235915097,3,2.5,3,stcules,English Strong Ale,3,3,Red Moon,6.2,48213
10325,Vecchio Birraio,1235916604,3,2.5,3,stcules,Foreign / Export Stout,3,3,Black Horse Black Beer,6.5,48215
10325,Vecchio Birraio,1234725145,3,3,3.5,stcules,German Pilsener,2.5,3,Sausa Pils,5,47969
1075,Caldera Brewing Company,1293735206,4,4.5,4,johnmichaelsen,American Double / Imperial IPA,4,4.5,Cauldron DIPA,7.7,64883
1075,Caldera Brewing Company,1325524659,3,3.5,3.5,oline73,Herbed / Spiced Beer,3,3.5,Caldera Ginger Beer,4.7,52159
1075,Caldera Brewing Company,1318991115,3.5,3.5,3.5,Reidrover,Herbed / Spiced Beer,4,4,Caldera Ginger Beer,4.7,52159
1075,Caldera Brewing Company,1306276018,3,2.5,3.5,alpinebryant,Herbed / Spiced Beer,2,3.5,Caldera Ginger Beer,4.7,52159
1075,Caldera Brewing Company,1290454503,4,3,3.5,LordAdmNelson,Herbed / Spiced Beer,3.5,4,Caldera Ginger Beer,4.7,52159

What am I doing wrong?

  • First, use print(row) to see what the variable actually contains. The list may be shorter than you expect. Commented Feb 18, 2020 at 2:01
  • You probably have a line in your CSV file that doesn't have data in the 11th column, beer_name. Is there an empty line at the bottom of the file? Commented Feb 18, 2020 at 2:01
  • @Matthew, @Jon and @furas, my file has more than 1 million rows, that's just the result of cat -10 beers.csv Commented Feb 18, 2020 at 2:13
  • Have you done any debugging? Have you tried identifying on which line of the CSV the error occurs, for example? Commented Feb 18, 2020 at 2:31
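The debugging suggested in the comments can be sketched as follows. This is a minimal illustration, not the asker's actual session: it passes the file name string to csv.reader exactly as the question's code does and inspects the rows that come back. Because csv.reader accepts any iterable of strings, and iterating over a str yields one character at a time, each "row" is a one-element list, which would explain the IndexError on row[10]. The variable names are illustrative.

```python
import csv

csv_file = "beers.csv"  # a file name (str), not an open file object

# csv.reader iterates over whatever it is given. Iterating over a str
# yields single characters, so every "row" is a one-character list and
# the file on disk is never opened.
rows = list(csv.reader(csv_file, delimiter=',', skipinitialspace=True))

print(rows[:3])      # [['b'], ['e'], ['e']]
print(len(rows[0]))  # 1, so row[10] is out of range
```

Note that this runs even if beers.csv does not exist, since only the characters of the name are read.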

1 Answer

import csv

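# Columns to collect unique values for, named by their header.
cols = ('brewery_name', 'beer_style', 'beer_name', 'beer_abv')
sets = {}
# Open the file and pass the file object (not the file name) to the reader.
with open("beers.csv", newline="") as f:
    reader = csv.DictReader(f)  # uses the first row as the field names
    for row in reader:
        for col in cols:
            # Create the set on first use, then add this row's value.
            sets.setdefault(col, set()).add(row[col])
print(sets)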

$ python3 beers.py
{'brewery_name': {'Caldera Brewing Company', 'Vecchio Birraio'}, 'beer_style': {'American Double / Imperial IPA', 'Foreign / Export Stout', 'Herbed / Spiced Beer', 'German Pilsener', 'Hefeweizen', 'English Strong Ale'}, 'beer_name': {'Black Horse Black Beer', 'Sausa Pils', 'Sausa Weizen', 'Cauldron DIPA', 'Caldera Ginger Beer', 'Red Moon'}, 'beer_abv': {'5', '7.7', '4.7', '6.5', '6.2'}}
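The code above works because it opens the file and hands the file object to the csv module, whereas the question's code passed the file name itself. For comparison, here is a minimal sketch that keeps the positional indexes from the question. The helper name unique_by_index is hypothetical, the sample data is three lines copied from the question so the snippet runs without the real file, and silently skipping short rows is an assumption about the desired behavior.

```python
import csv
import io

# Three lines copied from the question's sample, standing in for beers.csv.
SAMPLE = """\
brewery_id,brewery_name,review_time,review_overall,review_aroma,review_appearance,review_profilename,beer_style,review_palate,review_taste,beer_name,beer_abv,beer_beerid
10325,Vecchio Birraio,1234817823,1.5,2,2.5,stcules,Hefeweizen,1.5,1.5,Sausa Weizen,5,47986
1075,Caldera Brewing Company,1293735206,4,4.5,4,johnmichaelsen,American Double / Imperial IPA,4,4.5,Cauldron DIPA,7.7,64883
"""


def unique_by_index(lines, indexes):
    """Hypothetical helper: collect the unique values at each column index."""
    reader = csv.reader(lines, skipinitialspace=True)
    next(reader, None)  # skip the header row
    found = {i: set() for i in indexes}
    for row in reader:
        if len(row) <= max(indexes):
            continue  # assumption: skip blank or short rows instead of failing
        for i in indexes:
            found[i].add(row[i])
    return found


# Indexes 1, 7, 10, 11 are brewery_name, beer_style, beer_name, beer_abv.
result = unique_by_index(io.StringIO(SAMPLE), (1, 7, 10, 11))
print(result)
```

With the real file, the same helper would be called inside `with open("beers.csv", newline="") as f:` and given `f` in place of the io.StringIO object.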

4 Comments

I only want the unique values in those columns, hence the use of set() in my original post.
@user3046782 "I only want the unique values in those columns hence the use of set() in my original post" That isn't what your question is about though, no?
@AMC it should be apparent and obvious from the code I posted, hence the sets, no?
@user3046782 I’m not sure what you mean, I was referring only to the focus of the question. The sets shouldn’t have any impact on the issue.
