1

I am trying to read a CSV file into a list and then sort it based on the first two columns of the list (first by first column and then by second column if the first column is the same). This is what I am doing:

def sortcsvfiles(inputfilename,outputfilename):
    list1=[]
    row1=[]
    with open(inputfilename,'rt') as csvfile1:
        reader=csv.reader(csvfile1)
        cnt=0       
        for row in reader:
            if cnt==0:        #skip first row as it contains header information
                row1=row
                cnt+=1
                continue    
            list1.append((row)) 

        list1.sort(key=lambda ro: (int(ro[0]),int(ro[1])))

    list1.insert(0, row1)
    with open(outputfilename,'wt') as csvfile1:
        writer=csv.writer(csvfile1, lineterminator='\n')
        for row in list1:
            writer.writerow(row)

But I am getting the following error:

  File "C:\Users\50004182\Documents\temp.py", line 37, in <lambda>
    list1.sort(key=lambda ro: (int(ro[0]),int(ro[1])))
IndexError: list index out of range

How can I fix this?

1
  • Please provide a sample copy of your CSV file as well. Commented Dec 25, 2015 at 12:37

2 Answers

4

You probably have an empty line in your file, perhaps the last one. csv.reader returns an empty list for such a line, so ro[0] raises the IndexError. One fix is to skip empty rows:

import csv

def sortcsvfiles(inputfilename,outputfilename):
    with open(inputfilename,'rt') as csvfile:
        reader = csv.reader(csvfile)
        header = next(reader)
        data = [row for row in reader if row] # ignore empty lines
        data.sort(key=lambda ro: (int(ro[0]),int(ro[1])))

    with open(outputfilename,'wt') as csvfile:
        writer=csv.writer(csvfile, lineterminator='\n')
        writer.writerow(header)
        writer.writerows(data)
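
To see why a blank line triggers the error, here is a minimal sketch. The sample data is made up for illustration and is not the asker's file; it only shows that csv.reader parses a blank line as an empty list, which is what the `if row` filter drops.

```python
import csv
import io

# Hypothetical sample data: a header, two rows, and a trailing blank line
sample = "a,b\n2,1\n1,5\n\n"

rows = list(csv.reader(io.StringIO(sample)))

# The blank line is parsed as an empty list, so row[0] would raise IndexError
blank_rows = [r for r in rows if not r]

# Filtering on truthiness keeps only rows with at least one field
non_empty = [r for r in rows if r]
```

Note that this filter only removes completely empty rows; a row with a single field would still fail on ro[1].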

2

The error occurs because at least one row has fewer than 2 columns; it may have 1 or even 0.

You could test for this before appending the row:

if len(row) > 1:
    list1.append(row) 

To sort all rows while keeping the header first, read the header with the next() function (see a previous answer of mine), then sort the remaining rows, for example with the sorted() function:

import csv

def sortcsvfiles(inputfilename, outputfilename):
    with open(inputfilename,'rt') as csvfile1:
        reader = csv.reader(csvfile1)
        headers = next(reader, None)  # get one row, or None if there are no rows
        rows = sorted(
            (r for r in reader if len(r) > 1),
            key=lambda r: (int(r[0]), int(r[1])))

    with open(outputfilename,'wt') as csvfile1:
        writer = csv.writer(csvfile1, lineterminator='\n')
        if headers:
            writer.writerow(headers)
        writer.writerows(rows)

I used writer.writerows() to write the whole list of sorted rows in one call.
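
If it is not obvious which rows are too short, a small diagnostic can list them before sorting. This is only a sketch under assumptions: find_short_rows is a hypothetical helper name (not part of the csv module), and the sample data is invented. It relies on the reader's line_num attribute to report where each short row was read.

```python
import csv
import io

def find_short_rows(lines, min_columns=2):
    """Return (line_number, row) pairs for rows with fewer than min_columns fields.

    `lines` is any iterable of text lines, such as an open file.
    """
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row, if there is one
    # reader.line_num is the number of source lines read so far
    return [(reader.line_num, row) for row in reader if len(row) < min_columns]

# Hypothetical sample: line 3 has one field, line 4 is blank
sample = io.StringIO("x,y\n3,1\n7\n\n1,2\n")
short = find_short_rows(sample)
```

With a real file, the same function could be called on the file object opened for reading, and any reported rows inspected by hand.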

6 Comments

I don't think there are any rows with length less than 2. I tried your solution but still it gives the same error.
@Noober: that's the only reason the exception occurs: r[0] or r[1] doesn't exist. Note that the traceback shows the exception occurs within the lambda.
@Noober: please share the traceback (in a pastie) of the new exception with my change applied. I strongly doubt that my code will give the exact same exception.
Both the errors are exactly the same. Here is the pastie link: pastie.org/private/xjufbu45whc3nl95xzwaw
@Noober: that's not quite my code. You called sorted on list1. I called sorted() on a generator expression that filters out any rows shorter than 2 elements. Can you share the code (in a pastie again) that throws that exception?