How to read a specific row of a csv file in python?

Question

I have searched like crazy trying to find specifically how to read a row in a csv file.

I need to read a random row out of 1000, each of which has 3 columns. The first column has an email. I need to put in a random email, and get columns 2 and 3 out. (Python 2.7, csv file)

Example:

Name Date  Color
Ray  May   Gray
Alex Apr   Green
Ann  Jun   Blue
Kev  Mar   Gold
Rob  May   Black

Instead of column 1 row 3, I need [Ann], her whole row. This is a CSV file, with over 1000 names. I have to put in her name and output her whole row.

What I have tried

from collections import namedtuple
Entry = namedtuple('Entry', 'Name, Date, Color')
file_location = "C:/Users/abriman/Desktop/Book.csv"
ss_dict = {}
spreadsheet = file_location = "C:/Users/abriman/Desktop/Book.csv"
for row in spreadsheet:
    entry = Entry(*tuple(row))
    ss_dict['Ann']

And my error reads

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: __new__() takes exactly 4 arguments (2 given)

I have tried other ways too and got little to no result. I'm a beginner at python.

senshin · Accepted Answer · 2014-06-16 20:31:11Z

4

You're on the right track. First issue: you're never opening the file located at file_location. Thus, when you iterate for row in spreadsheet:, you're iterating over the characters of spreadsheet, which are the characters of file_location, which are the characters of "C:/Users/...". So the first thing you want to do is actually open the file:

spreadsheet = open(file_location, 'r')

You still have another issue in your loop. When you iterate over a file in a for loop, you get back the lines of the file. So, at each iteration, row will be a line, e.g. "Ray May Gray". When you call tuple() on that, you're going to get a tuple that looks like ('R', 'a', 'y', ' ', ' ', 'M', ...). What you need to do is construct your tuple by splitting on whitespace:

entry = Entry(*row.split())

Then, you need to add your entry to the dictionary ss_dict:

ss_dict[entry.Name] = entry

Finally, you can read out the value of ss_dict['Ann'], but this should be outside your loop - if you do it inside your loop, you may be trying to read the value of ss_dict['Ann'] before it has been set. All in all, your code should look like this:

from collections import namedtuple
Entry = namedtuple('Entry', 'Name, Date, Color')
file_location = "C:/Users/abriman/Desktop/Book.csv"
ss_dict = {}
spreadsheet = open(file_location, 'r') # <--
for row in spreadsheet:
    entry = Entry(*row.split()) # <--
    ss_dict[entry.Name] = entry # <--
print ss_dict['Ann']

Incidentally, the reason you're getting that error message is that when you do for row in spreadsheet: with spreadsheet being a string, row is just a single character, as I mentioned. So tuple(row) is a tuple of length 1, and *tuple(row) passes only one argument rather than three. (The counts in the message, 4 expected and 2 given, are each one higher because they include the implicit class argument of __new__.)
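To see this for yourself, here is a small sketch that reproduces the failure without touching any file. The names first, fields and failed are only illustrative, and the exact wording of the TypeError differs between Python versions.

```python
from collections import namedtuple

Entry = namedtuple('Entry', 'Name, Date, Color')
spreadsheet = "C:/Users/abriman/Desktop/Book.csv"  # a string, not an open file

first = next(iter(spreadsheet))  # iterating over a string yields characters
fields = tuple(first)            # a 1-tuple holding that single character

try:
    Entry(*fields)               # one positional argument instead of three
    failed = False
except TypeError as exc:
    failed = True
    print("TypeError: %s" % exc)
```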

All that said, you might want to consider looking at the csv module, which is part of the standard library, and is precisely designed for reading csv files. It will probably make your life easier in the long run.
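As a rough sketch of what that could look like, assuming the file really is comma-separated and its first line is a header: the helper name find_row and the sample data below are made up for illustration. csv.reader accepts any iterable of lines, so a plain list stands in for the open file here.

```python
import csv

def find_row(lines, key):
    """Return the first row whose first column equals key, or None."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row (assumed to be present)
    for row in reader:
        if row and row[0] == key:
            return row
    return None

# Hypothetical comma-separated version of the example data.
sample = [
    "Name,Date,Color",
    "Ray,May,Gray",
    "Alex,Apr,Green",
    "Ann,Jun,Blue",
]
print(find_row(sample, "Ann"))
```

With a real file you would pass the open file object in place of the list. This stops reading as soon as the name is found, which may be enough if you only need one lookup; building a dictionary as above is the better fit for repeated lookups.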

edited Jun 16, 2014 at 20:31

answered Jun 16, 2014 at 20:25

senshin


2 Comments

user3726972 Over a year ago

First, you all are awesome! Great responses, and I appreciate the help. For your code, I'm still getting an error: File "<stdin>", line 4 print ss_dict['Ann'] ^ SyntaxError: invalid syntax. I have also tried indenting it, but nothing.

senshin Over a year ago

@user3726972 Do you have a caret ^ in your code? It looks like it, and that would probably be a syntax error. If the issue is something else, please post another question asking about this new issue.

Slick · Accepted Answer · 2014-06-16 20:24:17Z

3

I think what you need is enumerate

def read_csv_line(line_number, filename):
    # line_number is 1-based; returns the raw line (newline included) or None
    with open(filename) as fileobj:
        for i, line in enumerate(fileobj):
            if i == (line_number - 1):
                return line
    return None

Then you can feed your random number and filename to get a random line.
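One possible way to come up with that random number, as a sketch only: it assumes the first line is a header and that you know how many data rows there are (1000 in the question, 3 in the toy data here). The helper name random_data_line is hypothetical, and it takes any iterable of lines so it can be tried without a file.

```python
import random

def random_data_line(lines, row_count):
    """Return one randomly chosen data line; line 1 is assumed to be a header."""
    # Data rows sit on 1-based line numbers 2 .. row_count + 1.
    target = random.randint(2, row_count + 1)
    for number, line in enumerate(lines, 1):
        if number == target:
            return line
    return None  # the input had fewer lines than row_count claimed

sample = [
    "Name Date  Color\n",
    "Ray  May   Gray\n",
    "Alex Apr   Green\n",
    "Ann  Jun   Blue\n",
]
picked = random_data_line(sample, 3)
print(picked)
```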

answered Jun 16, 2014 at 20:24

Slick


nefo_x · Accepted Answer · 2014-06-16 20:42:48Z

The solution to your problem could be a simple dictionary comprehension:

>>> from collections import namedtuple
>>> Entry = namedtuple('Entry', 'Name, Date, Color')
>>> [l for l in open('t.tsv', 'r')]
<<<
['Name Date  Color\n',
 'Ray  May   Gray\n',
 'Alex Apr   Green\n',
 'Ann  Jun   Blue\n',
 'Kev  Mar   Gold\n',
 'Rob  May   Black\n']
>>> [l.split() for l in open('t.tsv', 'r')]
<<<
[['Name', 'Date', 'Color'],
 ['Ray', 'May', 'Gray'],
 ['Alex', 'Apr', 'Green'],
 ['Ann', 'Jun', 'Blue'],
 ['Kev', 'Mar', 'Gold'],
 ['Rob', 'May', 'Black']]
>>> [Entry(*l.split()) for l in open('t.tsv', 'r')]
<<<
[Entry(Name='Name', Date='Date', Color='Color'),
 Entry(Name='Ray', Date='May', Color='Gray'),
 Entry(Name='Alex', Date='Apr', Color='Green'),
 Entry(Name='Ann', Date='Jun', Color='Blue'),
 Entry(Name='Kev', Date='Mar', Color='Gold'),
 Entry(Name='Rob', Date='May', Color='Black')]
>>> {e.Name:e for e in list(Entry(*l.split()) for l in open('t.tsv', 'r'))}
<<<
{'Alex': Entry(Name='Alex', Date='Apr', Color='Green'),
 'Ann': Entry(Name='Ann', Date='Jun', Color='Blue'),
 'Kev': Entry(Name='Kev', Date='Mar', Color='Gold'),
 'Name': Entry(Name='Name', Date='Date', Color='Color'),
 'Ray': Entry(Name='Ray', Date='May', Color='Gray'),
 'Rob': Entry(Name='Rob', Date='May', Color='Black')}

I think you want the first row to be read as header names. Python's csv module has DictReader for that: https://docs.python.org/2/library/csv.html#csv.DictReader

>>> import csv
>>> for line in csv.DictReader(open('t.tsv')): print line  # don't forget to make your file comma-separated.
{'Date': 'May', 'Color': 'Gray', 'Name': 'Ray'}
{'Date': 'Apr', 'Color': 'Green', 'Name': 'Alex'}
{'Date': 'Jun', 'Color': 'Blue', 'Name': 'Ann'}
{'Date': 'Mar', 'Color': 'Gold', 'Name': 'Kev'}
{'Date': 'May', 'Color': 'Black', 'Name': 'Rob'}

or with dictionary comprehension:

>>> { line['Name']: line for line in csv.DictReader(open('t.tsv')) }
<<<
{'Alex': {'Color': 'Green', 'Date': 'Apr', 'Name': 'Alex'},
 'Ann': {'Color': 'Blue', 'Date': 'Jun', 'Name': 'Ann'},
 'Kev': {'Color': 'Gold', 'Date': 'Mar', 'Name': 'Kev'},
 'Ray': {'Color': 'Gray', 'Date': 'May', 'Name': 'Ray'},
 'Rob': {'Color': 'Black', 'Date': 'May', 'Name': 'Rob'}}
>>> rows_by_name = { line['Name']: line for line in csv.DictReader(open('t.tsv')) }
>>> rows_by_name['Ann']
<<< {'Color': 'Blue', 'Date': 'Jun', 'Name': 'Ann'}

If you want random samples, I suggest first reading the rows into a list and then making the selection with the random module. Or... let's do it with Entry:

>>> rows = list(Entry(*l.split()) for l in open('t.tsv', 'r'))
>>> import random
>>> random.sample(rows, 1)
<<< [Entry(Name='Ray', Date='May', Color='Gray')]
>>> random.sample(rows, 1)
<<< [Entry(Name='Alex', Date='Apr', Color='Green')]
>>> random.sample(rows, 1)
<<< [Entry(Name='Name', Date='Date', Color='Color')]
>>> random.sample(rows, 1)
<<< [Entry(Name='Alex', Date='Apr', Color='Green')]
>>> random.sample(rows, 1)
<<< [Entry(Name='Alex', Date='Apr', Color='Green')]
>>> random.sample(rows, 1)
<<< [Entry(Name='Alex', Date='Apr', Color='Green')]
>>> random.sample(rows, 3)
<<<
[Entry(Name='Ray', Date='May', Color='Gray'),
 Entry(Name='Kev', Date='Mar', Color='Gold'),
 Entry(Name='Ann', Date='Jun', Color='Blue')]
>>> random.sample(rows, 3)
<<<
[Entry(Name='Ann', Date='Jun', Color='Blue'),
 Entry(Name='Rob', Date='May', Color='Black'),
 Entry(Name='Name', Date='Date', Color='Color')]
>>> random.sample(rows, 3)
<<<
[Entry(Name='Rob', Date='May', Color='Black'),
 Entry(Name='Ann', Date='Jun', Color='Blue'),
 Entry(Name='Kev', Date='Mar', Color='Gold')]

But beware: reading every row into a list can use a lot of memory if the file is large.
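If memory is a concern, one alternative (a different technique from the list-based sampling above) is reservoir sampling, which picks a row in a single pass while holding only one row at a time. This is a minimal sketch; the name sample_one is made up, and the list of lines stands in for an open file.

```python
import random

def sample_one(rows):
    """Return one item chosen uniformly at random, holding one item at a time."""
    chosen = None
    for seen, row in enumerate(rows, 1):
        # Keep the new row with probability 1/seen; after the loop every row
        # has had the same overall chance of being the one kept.
        if random.randrange(seen) == 0:
            chosen = row
    return chosen

lines = iter(["Ray  May   Gray\n", "Alex Apr   Green\n", "Ann  Jun   Blue\n"])
print(sample_one(lines).split())
```

To keep the header row out of the draw, you could call next() on the file object once before passing it in.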
