Pandas Dataframe apply() method provides a row object, but how do you access the index value

Question

I am new to pandas and DataFrames and have run into an issue. The DataFrame.apply() method passes a row parameter to the provided function. However, I can't work out from this row parameter what the index value corresponding to that row is.

An example

import numpy as np
from pandas import DataFrame

df = DataFrame({'a': np.random.randn(6),
                'b': ['foo', 'bar'] * 3,
                'c': np.random.randn(6)})

df = df.set_index('a')

def my_test2(row):
   return "{}.{}".format(row['a'], row['b'])

df['Value'] = df.apply(my_test2, axis=1)

This yields a KeyError:

KeyError: ('a', u'occurred at index -1.16119852166')

The problem is that row['a'] in the my_test2 function fails. If I don't call df.set_index('a') it works fine, but I do want to have an index on a.
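
For illustration, here is a minimal sketch of why the lookup fails, using fixed numbers in place of the random values (those numbers are assumptions, not from the original post). After set_index('a'), the label 'a' is no longer among the row's own labels:

```python
import pandas as pd

# Fixed values stand in for np.random.randn(...) so the sketch is reproducible.
df = pd.DataFrame({'a': [0.5, 1.5, 2.5, 3.5],
                   'b': ['foo', 'bar'] * 2,
                   'c': [10.0, 20.0, 30.0, 40.0]})
df = df.set_index('a')

first_row = df.iloc[0]
# Only the remaining columns are labels of the row; 'a' moved to the index.
remaining_labels = list(first_row.index)
print(remaining_labels)

try:
    first_row['a']
    lookup_failed = False
except KeyError:
    # This is the same kind of failure that apply() reports for each row.
    lookup_failed = True
print(lookup_failed)
```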

I tried duplicating column a (once as index, and once as a column) and this works, but this just seems ugly and problematic.
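
A minimal sketch of that duplication workaround, assuming set_index(..., drop=False) is how the column is kept (the post does not say how the duplication was done) and using fixed numbers in place of the random values:

```python
import pandas as pd

# Fixed values stand in for np.random.randn(...) so the sketch is reproducible.
df = pd.DataFrame({'a': [0.5, 1.5, 2.5, 3.5],
                   'b': ['foo', 'bar'] * 2})

# drop=False keeps 'a' as a regular column in addition to using it as the index.
df = df.set_index('a', drop=False)

def my_test2(row):
    # row['a'] works here only because the column was kept alongside the index.
    return "{}.{}".format(row['a'], row['b'])

df['Value'] = df.apply(my_test2, axis=1)
print(df['Value'].tolist())
```

This works, but it stores the same values twice, which is the duplication the question wants to avoid.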

Any ideas on how to get the corresponding index value given the row object?

Many thanks in advance.

That particular error is coming about because you typed df.index(b) instead of df = df.set_index("b"), which is why you're getting a NameError instead of a KeyError. (Fixing that won't solve your problem, but it will make this question make more sense.)
– DSM, Commented Jul 11, 2014 at 12:58
Thanks for the comment; it was wrong. My bad for posting before my first cup of tea. I have fixed the post.
– Paul H, Commented Jul 11, 2014 at 13:35

BKay · Accepted Answer · 2016-10-21 23:35:18Z

5

I believe what you want is this:

def my_test(row):
   return "{}.{}".format(row.name, row['b'])

This works because:

"{}.{}".format("ham", "cheese")

returns

'ham.cheese'

and if you reference a single row, its name attribute holds that row's index label. For the example above:

df.iloc[0]

returns

b                           foo
c                      1.417726
Value    0.7842562355491481.foo
Name: 0.784256235549, dtype: object

so df.iloc[0].name is the index value shown on the Name line.

Therefore, for the ith row, this function is equivalent to looking up that row's index value and running this command:

"{}.{}".format(df.iloc[i].name, df.iloc[i]['b'])

Calling apply with axis=1 then does this for every row.
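
Putting the pieces together, here is a self-contained sketch of the approach. The fixed numbers replace the random values from the question so the result is reproducible; they are assumptions for illustration only.

```python
import pandas as pd

# Fixed values stand in for np.random.randn(6) so the output is reproducible.
df = pd.DataFrame({'a': [0.5, 1.5, 2.5, 3.5, 4.5, 5.5],
                   'b': ['foo', 'bar'] * 3,
                   'c': [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]})
df = df.set_index('a')

def my_test(row):
    # With axis=1, each row is a Series whose .name is that row's index label.
    return "{}.{}".format(row.name, row['b'])

df['Value'] = df.apply(my_test, axis=1)
values = df['Value'].tolist()
print(values)

# The same label is reachable directly through the DataFrame's index.
print(df.iloc[0].name == df.index[0])
```

No duplicate column is needed: the index label comes from row.name and the remaining columns are read from the row as usual.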

edited Oct 21, 2016 at 23:35

answered Jul 11, 2014 at 16:31


1 Comment

BKay Over a year ago

Hopefully that helps.
