
I'm new to programming, and experimenting with Python 3. I've found a few topics which deal with IndexError but none that seem to help with this specific circumstance.

I've written a function which opens a text file, reads it one line at a time, and slices the line up into individual strings which are each appended to a particular list (one list per 'column' in the record line). Most of the slices are multiple characters [x:y] but some are single characters [x].

I'm getting an IndexError: string index out of range message, when as far as I can tell, it isn't. This is the function:

def read_recipe_file():
    recipe_id = []
    recipe_book = []
    recipe_name = []
    recipe_page = []
    ingred_1 = []
    ingred_1_qty = []
    ingred_2 = []
    ingred_2_qty = []
    ingred_3 = []
    ingred_3_qty = []

    f = open('recipe-file.txt', 'r')  # open the file 
    for line in f:
        # slice out each component of the record line and store it in the appropriate list
        recipe_id.append(line[0:3])
        recipe_name.append(line[3:23])
        recipe_book.append(line[23:43])
        recipe_page.append(line[43:46])
        ingred_1.append(line[46]) 
        ingred_1_qty.append(line[47:50])
        ingred_2.append(line[50]) 
        ingred_2_qty.append(line[51:54])
        ingred_3.append(line[54]) 
        ingred_3_qty.append(line[55:])
    f.close()
    return recipe_id, recipe_name, recipe_book, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, ingred_3, \
           ingred_3_qty

This is the traceback:

Traceback (most recent call last):
  File "recipe-test.py", line 84, in <module>
    recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, ingred_3, ingred_3_qty = read_recipe_file()
  File "recipe-test.py", line 27, in read_recipe_file
    ingred_1.append(line[46])

The code which calls the function in question is:

print('To show list of recipes: 1')
print('To add a recipe: 2')
user_choice = input()
recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty, ingred_2, ingred_2_qty, \
ingred_3, ingred_3_qty = read_recipe_file()

if int(user_choice) == 1:
    print_recipe_table(recipe_id, recipe_book, recipe_name, recipe_page, ingred_1, ingred_1_qty,
                    ingred_2, ingred_2_qty, ingred_3, ingred_3_qty)

elif int(user_choice) == 2:
    #code to add recipe

The failing line is this:

ingred_1.append(line[46])

There are more than 46 characters in each line of the text file I am trying to read, so I don't understand why I'm getting an out of bounds error (a sample line is below). If I change the code to this:

ingred_1.append(line[46:])

to read a slice, rather than a specific character, the line executes correctly, and the program fails on this line instead:

ingred_2.append(line[50])

This leads me to think it is somehow related to appending a single character from the string, rather than a slice of multiple characters.

Here is a sample line from the text file I am reading:

001Cheese on Toast     Meals For Two       012120038005002

I should probably add that I'm well aware this isn't great code overall - there are lots of ways I could generally improve the program, but as far as I can tell the code should actually work.

  • Are there any empty lines? That would cause this error. Commented Sep 19, 2015 at 12:09
  • Are there tabs in the input file? Try printing the line length. Commented Sep 19, 2015 at 12:13
  • I think unutbu nailed it - there was an extra newline at the end of the source text file. Deleting it revealed another error in the appending code (forgot the [i] at the end of some of the list names) - but when I fixed THAT, everything worked as expected. :) Commented Sep 19, 2015 at 12:15
  • line[100000:] is always legitimate, regardless of the line length. You may want to add a try except block and print len(line) on exception. Commented Sep 19, 2015 at 12:16
  • @RichCairns: I'm glad you've solved the problem. Feel free to accept one of the answers already posted. Commented Sep 19, 2015 at 12:52

2 Answers

2

This will happen if some of the lines in the file are empty or at least short. A stray newline at the end of the file is a common cause, since that comes up as an extra blank line. The best way to debug a case like this is to catch the exception, and investigate the particular line that fails (which almost certainly won't be the sample line you reproduced):

try:
    ingred_1.append(line[46])
except IndexError:
    print(line)
    print(len(line))

Catching this exception is also usually the right way to deal with the error: you've detected a pathological case, and now you can consider what to do. You might for example:

  • continue, which will silently skip processing that line;
  • log something and then continue; or
  • bail out by raising a new, more topical exception, e.g. raise ValueError("Line too short").
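
The three options above can be sketched in one place. This is only an illustration, not the asker's code: the helper name `collect_column` and its `on_short` parameter are made up for the example, and the standard `logging` module stands in for whatever logging you prefer.

```python
import logging

logger = logging.getLogger(__name__)


def collect_column(lines, index, on_short="skip"):
    """Collect the character at `index` from every line (hypothetical helper).

    on_short="skip"  -> silently ignore lines that are too short
    on_short="log"   -> log a warning, then ignore the line
    on_short="raise" -> abort with a more topical ValueError
    """
    column = []
    for number, line in enumerate(lines, start=1):
        try:
            column.append(line[index])
        except IndexError:
            if on_short == "skip":
                continue
            if on_short == "log":
                logger.warning("line %d too short (%d chars): %r",
                               number, len(line), line)
                continue
            raise ValueError(f"Line {number} too short: {line!r}") from None
    return column


print(collect_column(["abcdef", "ab", "uvwxyz"], 4))
```

Which mode is appropriate depends on whether a short line is harmless (a trailing blank line) or a sign of a damaged input file.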

Printing something relevant, with or without continuing, is almost always a good idea if this represents a problem with the input file that warrants fixing. Continuing silently is a good option if the problem is relatively trivial and you know it can't cause flow-on errors in the rest of your processing. You may want to distinguish the "too short" case from the "completely empty" case by detecting the latter early, for example at the top of your loop:

if not line.strip():
    # Skip blank lines (a blank line read from a file is '\n', not '')
    continue

Then handle the error for the other case appropriately.
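
One detail worth checking when testing for blank lines: iterating over a file yields each line with its trailing newline still attached, so a blank line arrives as '\n', which is truthy. A minimal sketch, using `io.StringIO` as a stand-in for the real file (the record text is invented for the example):

```python
import io

# Simulated file contents: two records followed by a stray blank line.
fake_file = io.StringIO("first record\nsecond record\n\n")

kept = []
for line in fake_file:
    # Each line keeps its trailing "\n", so a blank line is "\n", not "".
    if not line.strip():
        continue  # skip blank (or whitespace-only) lines
    kept.append(line.rstrip("\n"))

print(kept)  # ['first record', 'second record']
```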


Changing it to a slice works because string slices never raise IndexError. If both indexes in the slice are outside the string (in the same direction), you will get an empty string, e.g.:

>>> 'abc'[4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>> 'abc'[4:]
''
>>> 'abc'[4:7]
''
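
Because slices never fail, a short line can slip through slicing unnoticed and only blow up at the first single-character index. One way to catch it up front is an explicit length check before slicing. The sketch below is an assumption-laden illustration: `parse_record` and `RECORD_LENGTH` are hypothetical names, the length of 58 is inferred from the sample line in the question, and stripping the padding from the name and book fields is a choice made for the example.

```python
RECORD_LENGTH = 58  # assumed: 3 + 20 + 20 + 3 + 3 * (1 + 3) characters


def parse_record(line):
    """Split one fixed-width record into fields, or raise ValueError."""
    line = line.rstrip("\n")
    if len(line) < RECORD_LENGTH:
        raise ValueError(
            f"expected {RECORD_LENGTH} characters, got {len(line)}: {line!r}"
        )
    return {
        "recipe_id": line[0:3],
        "recipe_name": line[3:23].rstrip(),
        "recipe_book": line[23:43].rstrip(),
        "recipe_page": line[43:46],
        # Each ingredient is one code character followed by a 3-character quantity.
        "ingredients": [(line[i], line[i + 1:i + 4]) for i in (46, 50, 54)],
    }


# Rebuild the sample line from the question with explicit padding.
sample = ("001" + "Cheese on Toast".ljust(20) + "Meals For Two".ljust(20)
          + "012120038005002")
record = parse_record(sample + "\n")
print(record["recipe_name"], record["ingredients"])
```

With this check, a blank trailing line produces a ValueError that names the offending content instead of an IndexError partway through the record.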

0

Your code fails on line[46] because line contains fewer than 47 characters. The slice operation line[46:] still works because an out-of-range string slice returns an empty string.

You can verify that the line is too short by replacing

ingred_1.append(line[46])

with

try:
    ingred_1.append(line[46])
except IndexError:
    print('line = "%s", length = %d' % (line, len(line)))
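
If you would rather scan the whole file for offenders in one pass, something along these lines may help. It is a sketch only: `find_short_lines` is a hypothetical helper, and `repr` is used so that invisible characters such as '\n' or tabs show up in the output.

```python
def find_short_lines(lines, min_length):
    """Return (line_number, length, repr) for each line shorter than min_length."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if len(line) < min_length:
            problems.append((number, len(line), repr(line)))
    return problems


# A full-length line followed by the kind of blank line a stray newline produces.
demo_lines = ["x" * 58 + "\n", "\n"]
for number, length, shown in find_short_lines(demo_lines, min_length=47):
    print(f"line {number}: length {length}, content {shown}")
# prints: line 2: length 1, content '\n'
```

Passing the open file object instead of `demo_lines` would check the real input, since a file iterates line by line.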

2 Comments

Thanks @michael - but line definitely contains more than 47 characters - I posted an example line from the source file in the OP. What I neglected to add (my bad) was that all the lines in the source file are exactly the same length. The commenter in the OP has nailed it, I think - there was an extra newline at the end of the source file.
No, line[46] wouldn't cause an IndexError if line contained at least 47 characters. Note that an empty line contains fewer than 47 characters. You'll see the problem if you revert to the previous version of the input file and implement the try ... except construct.
