How to handle CSV with variable columns per row

Question

I've got a file with a header row containing a fixed number of labels, followed by rows of variable length. The last column should really be a sublist of items, but the list is written out as a run of additional columns.

Example:

Name, Address, Telephone
"Bob Smith", "123 main st", "111-111-1111"
"Jon Smith", "123 main st", "111-111-1111", "222-222-2222"

I ultimately want to iterate over the sublist, in this case the telephone numbers.

I've tried using csv.DictReader, but it drops the extra columns.

Thanks in advance.

Always good to post code to show an attempt. Pandas has awesome Excel tools.
– doedotdev, Commented Jul 5, 2018 at 20:33
Use some other delimiter instead of a comma in a value. As soon as csv encounters a comma, it splits. That's the way it works!
– rkatkam, Commented Jul 5, 2018 at 20:36

Mark Tolonen · Accepted Answer · 2018-07-05 20:36:42Z

3

You don't need DictReader. Use the standard reader and tuple assignment syntax:

Code:

import csv

with open('test.csv') as f:
    r = csv.reader(f)
    next(r) # skip header

    # Note this assigns the 3rd and remaining columns to 'telephone' as a list.
    for name,addr,*telephone in r:
        print(f'name:     {name}')
        print(f'address:  {addr}')
        for i,phone in enumerate(telephone,1):
            print(f'Phone #{i}: {phone}')
        print()

test.csv:

Name,Address,Telephone
"Bob Smith","123 main st","111-111-1111"
"Jon Smith","123 main st","111-111-1111","222-222-2222"

Output:

name:     Bob Smith
address:  123 main st
Phone #1: 111-111-1111

name:     Jon Smith
address:  123 main st
Phone #1: 111-111-1111
Phone #2: 222-222-2222
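One caveat worth sketching: the sample in the question has a space after each comma, while the test.csv above does not. With the default dialect, a quote that follows a space is not treated as a quote character, so the field keeps both the leading space and the quotes. Passing skipinitialspace=True to csv.reader handles that layout. The parse helper and the inline sample below are illustrative assumptions, not part of the original answer.

```python
import csv
import io

# Sample data copied from the question, including the space after each comma.
raw = '''Name, Address, Telephone
"Bob Smith", "123 main st", "111-111-1111"
"Jon Smith", "123 main st", "111-111-1111", "222-222-2222"
'''

def parse(text, **reader_kwargs):
    """Return (name, addr, phones) tuples, skipping the header row."""
    reader = csv.reader(io.StringIO(text), **reader_kwargs)
    next(reader)  # skip header
    return [(name, addr, phones) for name, addr, *phones in reader]

# Default dialect: fields after the first keep the space and the quotes.
default_rows = parse(raw)

# skipinitialspace=True ignores whitespace right after the delimiter,
# so the quotes are recognised and stripped.
clean_rows = parse(raw, skipinitialspace=True)

print(default_rows[0])
print(clean_rows[0])
```

If the real file has no spaces after the commas, the extra argument is harmless and can be left in place.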


5 Comments

yanir Over a year ago

This looks great, however I get an error on the asterisk *telephone.

Mark Tolonen Over a year ago

@yanir FYI I used Python 3.6, but I have Python 3.3 installed and it also has the feature. I'm not sure when it was added.

Mark Tolonen Over a year ago

@yanir PEP 3132 describes the feature and it was implemented in Python 3.0.

Mark Tolonen Over a year ago

@yanir If that tripped you up, you'll also want Python 3.6 for the f-strings (see PEP 498) in the example.

yanir Over a year ago

@MarkTolonen I was on 2.7. I installed 3.7 and it works now. Thanks!
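For anyone who cannot upgrade, here is a rough sketch of the same idea without starred unpacking or f-strings, using plain slicing and str.format instead. It assumes every row has at least the two leading columns (a shorter row would raise IndexError), and the inline lines list stands in for the file purely for illustration.

```python
import csv

# csv.reader accepts any iterable of strings, so a list of lines
# stands in for the open file here.
lines = [
    'Name,Address,Telephone',
    '"Bob Smith","123 main st","111-111-1111"',
    '"Jon Smith","123 main st","111-111-1111","222-222-2222"',
]

reader = csv.reader(lines)
next(reader)  # skip header

records = []
for row in reader:
    # The first two columns are fixed; everything after them is the sublist.
    name, addr = row[0], row[1]
    phones = row[2:]
    records.append((name, addr, phones))

    print('name:     {}'.format(name))
    print('address:  {}'.format(addr))
    for i, phone in enumerate(phones, 1):
        print('Phone #{}: {}'.format(i, phone))
    print('')
```

The slice row[2:] plays the role of *telephone: it is an empty list when a row has no phone numbers at all.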

nosklo · Accepted Answer · 2018-07-05 20:32:48Z

1

As you can see in DictReader docs:

If a row has more fields than fieldnames, the remaining data is put in a list and stored with the fieldname specified by restkey (which defaults to None).

All you must do is pass the restkey parameter and all your extra values will go there.

import csv

with open('yourfile.csv') as f:
    # Fields beyond the header's column count are collected in a list under 'extra'.
    cf = csv.DictReader(f, restkey='extra')
    for row in cf:
        print(row)

will print (on Python 3.8+, where each row is a plain dict):

{'Name': 'Bob Smith', 'Address': '123 main st', 'Telephone': '111-111-1111'}
{'Name': 'Jon Smith', 'Address': '123 main st', 'Telephone': '111-111-1111', 'extra': ['222-222-2222']}
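Since the goal is to iterate over all the phone numbers, the first one (under 'Telephone') and the overflow (under 'extra') probably need to be merged back into a single list. A minimal sketch, assuming the column names from the question; phones_for is a hypothetical helper, and io.StringIO stands in for the file:

```python
import csv
import io

raw = '''Name,Address,Telephone
"Bob Smith","123 main st","111-111-1111"
"Jon Smith","123 main st","111-111-1111","222-222-2222"
'''

def phones_for(row, key='Telephone', restkey='extra'):
    """Combine the named column with any overflow values into one list."""
    first = row.get(key)
    phones = [first] if first else []
    # The restkey entry only exists on rows that had extra fields.
    phones.extend(row.get(restkey, []))
    return phones

reader = csv.DictReader(io.StringIO(raw), restkey='extra')
phones_by_name = {row['Name']: phones_for(row) for row in reader}

for name, phones in phones_by_name.items():
    for i, phone in enumerate(phones, 1):
        print(f'{name} phone #{i}: {phone}')
```

Keying the result by name is only for the demonstration; with duplicate names a list of (name, phones) pairs would be safer.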
