46

I have read the docs for DataFrame.apply:

DataFrame.apply(func, axis=0, broadcast=False, raw=False, reduce=None, args=(), **kwds)
Applies function along input axis of DataFrame.

So, how can I apply a function to a specific column?

In [1]: import pandas as pd
In [2]: data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
In [3]: df = pd.DataFrame(data)
In [4]: df
Out[4]: 
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
In [5]: def addOne(v):
...:        v += 1
...:        return v
...: 
In [6]: df.apply(addOne, axis=1)
Out[6]: 
   A  B   C
0  2  5   8
1  3  6   9
2  4  7  10

I want to apply addOne to every value in df['A'], not to all columns. How can I do that with DataFrame.apply?

Thanks for your help!

2
  • 1
    Avoid using apply as much as possible. If you're not sure you need to use it, you probably don't. I recommend taking a look at When should I ever want to use pandas apply() in my code?. Commented Jan 30, 2019 at 10:22
  • 1
    @coldspeed That is nice, good question and answers in depth. Commented Jan 30, 2019 at 11:34

4 Answers

65

The answer is:

df['A'] = df['A'].map(addOne)

It is also worth learning the difference between map, applymap, and apply.
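To make that difference concrete, here is a minimal sketch on the question's data. It assumes addOne is written as a plain function that returns v + 1. As far as I know, applymap is the element-wise method for a whole DataFrame, and pandas 2.1+ renames it to DataFrame.map, so it is only mentioned in a comment rather than called.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

def addOne(v):
    return v + 1

# Series.map: calls addOne once per element of column A.
a_mapped = df['A'].map(addOne)

# Series.apply: for a plain Python function like this one,
# it also works element by element.
a_applied = df['A'].apply(addOne)

# DataFrame.apply: calls addOne once per column (axis=0 is the default),
# passing a whole Series, so every column changes, not just A.
# (DataFrame.applymap would instead call it once per cell of the frame.)
all_columns = df.apply(addOne)

print(a_mapped.tolist())
print(all_columns)
```

None of these calls modify df in place; each returns a new object that has to be assigned back if you want to keep it.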

But if you insist on using apply, you could try the following:

def addOne(v):
    v['A'] += 1
    return v

df.apply(addOne, axis=1)
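Note that df.apply returns a new DataFrame, so the result has to be assigned to keep it. A minimal sketch of the row-wise version follows. The name add_one_to_a and the explicit row.copy() are my own additions for illustration, so the example does not depend on whether pandas passes the row as a view or as a copy.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

def add_one_to_a(row):
    # With axis=1, `row` is a Series holding one row, indexed by column name.
    row = row.copy()   # work on a copy of the row (illustrative precaution)
    row['A'] += 1      # change only the value under column A
    return row

# apply returns a new DataFrame built from the returned rows.
result = df.apply(add_one_to_a, axis=1)
print(result)
```

Row-wise apply runs a Python function once per row, so on large frames the column-based form df['A'].map(addOne) is usually the cheaper choice.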

2 Comments

Can we apply the same function to both A and B at once?
@dondapati Sure, you can simply add v['B'] += 1 inside the addOne function. With axis=1, pandas apply passes each row to the function as v.
21

One simple way would be:

df['A'] = df['A'].apply(lambda x: x+1)
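For simple arithmetic like this, the function call per element is not needed at all. A short sketch comparing the lambda version with plain vectorized arithmetic on the question's data (the variable names are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Element-wise through apply, as in the line above.
via_apply = df['A'].apply(lambda x: x + 1)

# The same result with vectorized arithmetic on the whole column.
vectorized = df['A'] + 1

# Assign back to update only column A.
df['A'] = vectorized
print(df)
```

apply is still useful when the per-value logic cannot be expressed with vectorized operations.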

4 Comments

I did your suggestion by doing: df['A'] = df['A'].apply(lambda x: datetime.fromtimestamp(float(x)/1000.)) and I got: "A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead. " Any suggestions?
@Catarina Nogueira Try adding .copy() at the very end e.g. apply(...).copy()
I don't think this is a good solution. You're mutating the DataFrame while iterating over it. I would first make a copy of the DataFrame. See here: pandas.pydata.org/docs/user_guide/…
@Paul Good suggestion. Making a copy before applying a UDF can avoid some unexpected behavior.
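The warning quoted in the comments above typically shows up when df itself was created as a slice of another DataFrame. A hedged sketch of two common ways around it, using the question's data rather than the commenter's timestamps. The filter on B and the multiplication by 10 are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Option 1: take an explicit copy when creating the subset, so that later
# assignments clearly target an independent DataFrame.
subset = df[df['B'] > 4].copy()
subset['A'] = subset['A'].apply(lambda x: x + 1)

# Option 2: write into the original frame through .loc, as the warning suggests.
mask = df['B'] > 4
df.loc[mask, 'C'] = df.loc[mask, 'C'].apply(lambda x: x * 10)

print(subset)
print(df)
```

In option 1 the original df keeps its A values; in option 2 only the selected rows of C in df change.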
6

For anyone else looking for a solution that allows piping:

identity = lambda x: x

def transform_columns(df, mapper):
    return df.transform(
        {
            **{
                column: identity
                for column in df.columns
            },
            **mapper
        }
    )

# you can monkey-patch it on the pandas DataFrame (but don't have to, see below)
pd.DataFrame.transform_columns = transform_columns

(
    pd.DataFrame(data)
    .rename(columns={'A': 'A1'})   # just to demonstrate the motivation
    .transform_columns({'A1': add_one})
)

This also allows transforming several columns at once:

pd.DataFrame(data).transform_columns({
    'A': add_one,
    'B': add_two,
})

And if you do not want to monkey-patch DataFrame, you can always use it with pipe:

pd.DataFrame(data).pipe(transform_columns, {'A': add_one})

It would be great if this were natively supported by pandas though.

The snippets above are CC0.
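The snippets in this answer use add_one and add_two without defining them. Here is a self-contained sketch with hypothetical definitions of both (my assumption, not the answerer's code), run through pipe. For comparison it also shows DataFrame.assign, a different, built-in technique that chains in a similar way.

```python
import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}

# Hypothetical helpers; the answer above does not define them.
def add_one(v):
    return v + 1

def add_two(v):
    return v + 2

def transform_columns(df, mapper):
    # Columns missing from `mapper` get the identity function,
    # so they pass through unchanged.
    identity = lambda x: x
    return df.transform({**{column: identity for column in df.columns}, **mapper})

# Chainable use through pipe, without monkey-patching DataFrame.
piped = pd.DataFrame(data).pipe(transform_columns, {'A': add_one, 'B': add_two})

# Built-in alternative: assign takes a callable that receives the frame
# and returns the new column.
assigned = pd.DataFrame(data).assign(A=lambda d: d['A'].map(add_one))

print(piped)
print(assigned)
```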


2

You can use .apply() with a lambda function to solve this kind of problem.

Suppose your DataFrame looks like this:

A | B | C
----------
1 | 4 | 7
2 | 5 | 8
3 | 6 | 9

The function you want to apply:

def addOne(v):
    v += 1
    return v

If you write your code like this,

df['A'] = df.apply(lambda x: addOne(x.A), axis=1)

You will get:

A | B | C
----------
2 | 4 | 7
3 | 5 | 8
4 | 6 | 9
