10

I am pretty new to Python. I need to create a class that loads CSV data into a dictionary.

I want to be able to control the keys and values. For example, with the following code I can pull out worker1.name or worker1.age any time I want.

class ageName(object):
    '''class to represent a person'''
    def __init__(self, name, age):
        self.name = name
        self.age = age

worker1 = ageName('jon', 40)
worker2 = ageName('lise', 22)

#Now if we print this you see that it's stored in a dictionary
print worker1.__dict__
print worker2.__dict__
#
'''
{'age': 40, 'name': 'jon'}
#
{'age': 22, 'name': 'lise'}
#
'''
#

#when we call (key)worker1.name we are getting the (value)
print worker1.name
#
'''
#
jon
#
'''

But I am stuck at loading my CSV data into keys and values.

[1] I want to create my own keys: worker1 = ageName([name],[age],[id],[gender])

[2] Each of [name], [age], [id] and [gender] comes from a specific column in a CSV data file.

I really do not know how to approach this. I tried many methods but failed. I need some help to get started.

---- Edit: This is my original code

import csv

# let us first make student an object

class Student():
    def __init__(self):
        self.fname = []
        self.lname = []
        self.ID = []
        self.sport = []
        # let us read this file
        for row in list(csv.reader(open("copy-john.csv", "rb")))[1:]:
            self.fname.append(row[0])
            self.lname.append(row[1])   
            self.ID.append(row[2])
            self.sport.append(row[3])
    def Tableformat(self):
        print "%-14s|%-10s|%-5s|%-11s" %('First Name','Last Name','ID','Favorite Sport')
        print "-" * 45
        for (i, fname) in enumerate(self.fname):
           print "%-14s|%-10s|%-5s|%3s" %(fname,self.lname[i],self.ID[i],self.sport[i])
    def Table(self):
        print self.lname

class Database(Student):
    def __init__(self):
        g = 0
        choice = ['Basketball','Football','Other','Baseball','Handball','Soccer','Volleyball','I do not like sport']
        data = student.sport
        k = len(student.fname)
        print k
        freq = {}
        for i in data:
            freq[i] = freq.get(i, 0) + 1
        for i in choice:
            if i not in freq:
                freq[i] = 0
            print i, freq[i]


student = Student()
database = Database()

This is my current code (incomplete)

import csv
class Student(object):
    '''class to represent a person'''
    def __init__(self, lname, fname, ID, sport):
        self.lname = lname
        self.fname = fname
        self.ID = ID
        self.sport = sport
reader = csv.reader(open('copy-john.csv'), delimiter=',', quotechar='"')
student = [Student(row[0], row[1], row[2], row[3]) for row in reader][1::]
print "%-14s|%-10s|%-5s|%-11s" %('First Name','Last Name','ID','Favorite Sport')
print "-" * 45
for i in range(len(student)):
    print "%-14s|%-10s|%-5s|%3s" %(student[i].lname,student[i].fname,student[i].ID,student[i].sport)

choice = ['Basketball','Football','Other','Baseball','Handball','Soccer','Volleyball','I do not like sport']
lst = []
h = 0
k = len(student)
# 23
for i in range(len(student)):
    lst.append(student[i].sport) # merge together

for a in set(lst):
    print a, lst.count(a)

for i in set(choice):
    if i not in set(lst):
        # a sport nobody picked has a count of 0
        # (assigning to lst.count(i) is a syntax error)
        print i, 0
6
  • Note that if you really want a dictionary, you can't use worker1.name to get the values. Dictionaries are accessed using the form worker1['name']. So, which do you really want? Commented Dec 14, 2009 at 0:13
  • Hi Peter. I am sorry and I really appreciate your comment. That's a good question. Any pro and cons? I am sorry... Commented Dec 14, 2009 at 0:18
  • There are always pros and cons, but you asked for a dictionary. Do you mean you don't know whether you should use one or not? To answer that, we'd need to understand more about what you're going to do with the data. Commented Dec 14, 2009 at 0:54
  • I suspect there is just some confusion due to the fact that all the instance data is stored in a dictionary belonging to the instance. Commented Dec 14, 2009 at 1:01
  • I just edited my post. You can see my original and current codes. I am creating a small program that makes STUDENT an object, with attributes like gender, name. Using Tor Valamo's code (currently) is a good idea for some of the stuff. However, as I go down to other tasks, I found myself repeatedly stating for i in range loop just to pull out the entire student.fnames, student.lnames, student.ID. Commented Dec 14, 2009 at 1:52

4 Answers

12
import csv

reader = csv.reader(open('workers.csv'), delimiter=',', quotechar='"')
workers = [ageName(row[0], row[1]) for row in reader]

workers now has a list of all the workers

>>> workers[0].name
'jon'

added edit after question was altered

Is there any reason you're using old style classes? I'm using new style here.

class Student(object):
    sports = []
    def __init__(self, row):
       self.lname, self.fname, self.ID, self.sport = row
       self.sports.append(self.sport)
    def get(self):
       return (self.lname, self.fname, self.ID, self.sport)

reader = csv.reader(open('copy-john.csv'), delimiter=',', quotechar='"')
print "%-14s|%-10s|%-5s|%-11s" % tuple(reader.next()) # read header line from csv
print "-" * 45
students = list(map(Student, reader)) # read all remaining lines
for student in students:
    print "%-14s|%-10s|%-5s|%3s" % student.get()

# Printing all sports that are specified by students
for s in set(Student.sports): # class attribute
    print s, Student.sports.count(s)

# Printing sports that are not picked 
allsports = ['Basketball','Football','Other','Baseball','Handball','Soccer','Volleyball','I do not like sport']
for s in set(allsports) - set(Student.sports):
    print s, 0

Hope this gives you some ideas of the power of python sequences. ;)
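The same tally can also be sketched with collections.Counter. This is only a minimal sketch, written for Python 3 rather than the Python 2 used above. The sample rows and the helper name tally_sports are made up for illustration, and the column order (last name, first name, ID, sport) is assumed from the question.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for copy-john.csv; the column order
# (last name, first name, ID, sport) is an assumption taken from the question.
SAMPLE_CSV = """Last Name,First Name,ID,Favorite Sport
Smith,Anna,1,Soccer
Jones,Bob,2,Basketball
Lee,Cara,3,Soccer
"""

ALL_SPORTS = ['Basketball', 'Football', 'Other', 'Baseball', 'Handball',
              'Soccer', 'Volleyball', 'I do not like sport']


def tally_sports(csv_file, all_sports):
    """Return {sport: count} for every sport in all_sports, zeros included."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    # Count the fourth column of every non-empty data row.
    counts = Counter(row[3] for row in reader if row)
    # A Counter returns 0 for keys it has never seen, so unpicked sports
    # need no special handling.
    return {sport: counts[sport] for sport in all_sports}


tally = tally_sports(io.StringIO(SAMPLE_CSV), ALL_SPORTS)
for sport, count in tally.items():
    print(sport, count)
```

With a real file, pass open('copy-john.csv', newline='') in place of the io.StringIO object. Note that a sport found in the file but missing from ALL_SPORTS is silently left out of the result.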

edit 2, shortened as much as possible... just to show off :P

Ladies and gentlemen, 7(.5) lines.

allsports = ['Basketball','Football','Other','Baseball','Handball',
             'Soccer','Volleyball','I do not like sport']
sports = []
reader = csv.reader(open('copy-john.csv'))
for row in reader:
    if reader.line_num > 1: sports.append(row[3])  # skip the header line
    print "%-14s|%-10s|%-5s|%-11s" % tuple(row)
for s in allsports: print s, sports.count(s)

14 Comments

Wooo this is exactly what I was looking for reader = csv.reader(open('Book.csv'), delimiter=',', quotechar='"') workers = [ageName(row[0], row[1], row[2]) for row in reader] for i in range(len(workers)): print workers[i].lname, workers[i].fname, workers[i].ID
you can't define "print" as a method and use it as a statement on the same module.
I was thinking about it as I wrote it, and I thought it could maybe work. But I changed it now.
Wooo, this is cool. I am new to Python so I don't know what new / old style is. I will have to learn it over the break. Can you help me out with one more thing? Your code is great but gives two errors: Traceback (most recent call last): File "D:/Python26/test2.py", line 13, in <module> students = [Student(row) for row in reader] File "D:/Python26/test2.py", line 5, in __init__ (self.lname,self.fname,self.ID,self.sport) = row ValueError: too many values to unpack
oh actually if i add [1::] back will give this error instead Traceback (most recent call last): File "D:/Python26/test2.py", line 13, in <module> students = [Student(row) for row in reader[1::]] TypeError: '_csv.reader' object is unsubscriptable
9

I know this is a pretty old question, but it's impossible to read this and not think of the amazing new(ish) Python library, pandas. Its main unit of analysis is a thing called a DataFrame, which is modelled after the way R handles data.

Let's say you have a (very silly) csv file called example.csv which looks like this:

day,fruit,sales
Monday,Banana,10
Monday,Orange,20
Tuesday,Banana,12
Tuesday,Orange,22

If you want to read in a csv in double-quick time, and do 'stuff' with it, you'd be hard pressed to beat the following code for either brevity or ease of use:

>>> import pandas as pd
>>> csv = pd.read_csv('example.csv')
>>> csv
       day   fruit  sales
0   Monday  Banana     10
1   Monday  Orange     20
2  Tuesday  Banana     12
3  Tuesday  Orange     22
>>> csv[csv.fruit=='Banana']
       day   fruit  sales
0   Monday  Banana     10
2  Tuesday  Banana     12
>>> csv[(csv.fruit=='Banana') & (csv.day=='Monday')]
      day   fruit  sales
0  Monday  Banana     10

In my opinion, this is really fantastic stuff. Never iterate over a csv.reader object again!

2 Comments

Pretty nice. Thanks. Oh, after what? Almost 4 years! :)
This is the crazy thing about the online world: it never dies! I came across this question in a perfectly normal Googling session. Hope you check out pandas and enjoy it!
8

I second Mark's suggestion. In particular, look at DictReader from csv module that allows reading a comma separated (or delimited in general) file as a dictionary.

Look at PyMotW's coverage of csv module for a quick reference and examples of usage of DictReader, DictWriter
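As a rough illustration of the DictReader suggestion, here is a minimal sketch, assuming Python 3 and a file whose first row holds the column names. The sample rows, the helper name load_workers and the 'worker1', 'worker2', ... key scheme are all made up for this example.

```python
import csv
import io

# Hypothetical sample data; the header row supplies the dictionary keys.
SAMPLE_CSV = """name,age,id,gender
jon,40,1,M
lise,22,2,F
"""


def load_workers(csv_file):
    """Return a dict mapping 'worker1', 'worker2', ... to one row dict each."""
    reader = csv.DictReader(csv_file)  # the first row becomes the field names
    return {'worker%d' % n: dict(row)
            for n, row in enumerate(reader, start=1)}


workers = load_workers(io.StringIO(SAMPLE_CSV))
print(workers['worker1']['name'])
print(workers['worker2']['age'])
```

Values are read as strings, so an age would need int(...) before any arithmetic. Access is by key (workers['worker1']['name']), not by attribute (worker1.name).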

2 Comments

I don't understand why this got a -1 without an explanatory comment. What's wrong with a suggestion to use a DictReader?
I have to say, the link you posted was quite useful =). Going to vote you up on that. (Mods, I'm assuming that's OK?)
2

Have you looked at the csv module?

import csv
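One possible way to use it for the Student class from the question is sketched below, assuming Python 3. The sample rows and the helper name load_students are made up for illustration. The sketch also avoids the two errors quoted in the comments above: a csv.reader is an iterator, so the header is skipped with next() rather than slicing, and extra columns are trimmed before they reach the constructor.

```python
import csv
import io

# Hypothetical sample data; the column order is assumed from the question.
SAMPLE_CSV = """Last Name,First Name,ID,Favorite Sport
Smith,Anna,1,Soccer
Jones,Bob,2,Basketball
"""


class Student(object):
    '''class to represent a student'''
    def __init__(self, lname, fname, ID, sport):
        self.lname = lname
        self.fname = fname
        self.ID = ID
        self.sport = sport


def load_students(csv_file):
    """Build one Student per data row, skipping the header line."""
    reader = csv.reader(csv_file)
    next(reader)  # a reader cannot be sliced, so consume the header instead
    # row[:4] drops any extra columns that would otherwise break the call;
    # "if row" skips blank lines.
    return [Student(*row[:4]) for row in reader if row]


students = load_students(io.StringIO(SAMPLE_CSV))
print(students[0].fname, students[0].sport)
```

A row with fewer than four columns would still raise a TypeError here, so real data may need an explicit length check.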

1 Comment

Yes I did. In fact I have a poorly-written version, but I realize it was a pain in the neck so I decided to do dictionary right away.
