Pandas-like way to handle iloc out of bounds errors?

Question

I have an Excel report with several tables arranged in the sheet, and I'm parsing it with Pandas. The key/value pairs I'm scraping out of the report are always in the same columns. So I separated my lookups into groups that share the same key and value columns, and use iloc to find the correct row:

df[df.iloc[:, key_column] == 'apple'][value_column].values[0]

Many keys are present in every file, but occasionally one is not. In the rare event that a normally present key is missing, the whole block fails with an IndexError (index 0 is out of bounds for axis 0 with size 0):

try:
  parsed_xls['fruit'] = df[df.iloc[:, key_column] == 'apple'][value_column].values[0]
  parsed_xls['vegetable'] = df[df.iloc[:, key_column] == 'onion'][value_column].values[0]
  parsed_xls['stationary'] = df[df.iloc[:, key_column] == 'stapler'][value_column].values[0]
except IndexError:
  pass  # error reporting goes here

Short of putting each key/value pair in its own try...except, or writing a helper function that supplies a zero value when the key search fails, is there a more Pandas-like way to handle iloc lookups that raise this exception (and still catch errors)?

Just to clarify, is it key_column that may not be present or value_column? Or is it just possible that there may not be any such key present in the key column? Which is it? — cs95
– cs95, Commented Mar 10, 2018 at 21:55
The key may not be present. E.g., if only food is present when the report is generated, and there are no 'staplers' to report, then the 'stapler' key is not present.
– xtian, Commented Mar 10, 2018 at 23:19

jpp · Accepted Answer · 2018-03-10 23:32:59Z

6

The short answer is "No" - and I see no reason why such functionality should exist when you can wrap your logic in a helper function.

If, as you mention, you only occasionally see IndexError, try / except is preferred to if / else.

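import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 9, (1000, 10)))

# Column position 20 does not exist: the frame has only 10 columns
res = df.loc[df.iloc[:, 20] == 6, 5].values[0]
# IndexError: single positional indexer is out-of-bounds
# (a missing key in a valid column raises IndexError too:
#  index 0 is out of bounds for axis 0 with size 0)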

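def lookup_fn(df, key_col, key_val, val_col, idx=0):
    # key_col is a column position, val_col is a column label
    try:
        return df[df.iloc[:, key_col] == key_val][val_col].values[idx]
    except IndexError:
        # Raised both for an out-of-bounds key_col and for a key with no matching row
        return 0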

res = lookup_fn(df, 20, 6, 5)
# 0
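
Applied to the three lookups in the question, the helper might be used roughly as follows. This is only a sketch under assumptions: the `default` parameter, the `wanted` mapping, the `missing` list and the two-column toy frame are hypothetical names and data that are not in the original post. Returning `None` instead of 0 is a swapped-in choice so that a missing key can be told apart from a genuine zero and still be reported.

```python
import pandas as pd


def lookup_fn(df, key_col, key_val, val_col, default=None, idx=0):
    """Return the idx-th value in val_col among rows whose key column equals key_val."""
    try:
        return df[df.iloc[:, key_col] == key_val][val_col].values[idx]
    except IndexError:
        # No matching row (or key_col out of bounds): fall back to the default
        return default


# Toy stand-in for one parsed table: column 0 holds keys, column 1 holds values
df = pd.DataFrame({0: ['apple', 'onion'], 1: [3, 7]})

# Hypothetical mapping from output field name to the key searched for
wanted = {'fruit': 'apple', 'vegetable': 'onion', 'stationary': 'stapler'}

parsed_xls = {name: lookup_fn(df, 0, key, 1) for name, key in wanted.items()}

# Collect the fields that fell back to the default, for error reporting
missing = [name for name, value in parsed_xls.items() if value is None]
```

With this shape, one missing key no longer aborts the remaining lookups, and `missing` can feed whatever error reporting the surrounding script already does.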

2 Comments

xtian Over a year ago

I'm just trying to get my head around writing Pandas that isn't overly complex just for these outliers. In this daily report there are both extremes: the infrequent missing value and the infrequent addition. Over 2 years working with the data I've seen fewer than 10 additions per year. I have never seen some values missing, but I have to admit it's possible. I honestly didn't know if there was a common way to catch this exception. I like the helper solution.

jpp Over a year ago

I advise you to use try / except unless you see a performance drop (usually too many exceptions). If this occurs, you can easily move to an if / else statement. Something like: if key_col in range(len(df.columns)):...
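
A minimal sketch of what such an if / else variant could look like, assuming a non-negative `idx`. The name `lookup_checked`, the `default` parameter and the toy frame are hypothetical and not from the original comment; the second check (for a key with no matching rows) is an added assumption to cover the case described in the question.

```python
import pandas as pd


def lookup_checked(df, key_col, key_val, val_col, default=0, idx=0):
    """Like lookup_fn, but checks bounds explicitly instead of catching IndexError."""
    # Guard against a key column position that does not exist
    if key_col not in range(len(df.columns)):
        return default
    # Values in val_col for the rows whose key column equals key_val
    matches = df.loc[df.iloc[:, key_col] == key_val, val_col]
    # Guard against a key with no (or too few) matching rows
    if idx >= len(matches):
        return default
    return matches.iloc[idx]


# Toy frame: column 0 holds keys, column 1 holds values
df = pd.DataFrame({0: ['apple', 'onion'], 1: [3, 7]})

found = lookup_checked(df, 0, 'apple', 1)
no_key = lookup_checked(df, 0, 'stapler', 1)
no_column = lookup_checked(df, 20, 'apple', 1)
```

The explicit checks avoid the cost of raising exceptions when misses are frequent, at the price of a little more code than the try / except helper.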
