Clarification/Musing on Python Lambda function in Pandas

Question

I'm finally breaking free from the shackles of SPSS and am reveling in the freedom of Pandas and Python (love it). However, I'm trying to get a clearer picture of how Python's lambda functions interact with Pandas. They seem to pop up a lot. Here is an example I hope will clear up the murkiness.

After creating a new dataframe from a string split:

 bs = fh['basis'].str.split(',',expand = True)

I want to rename all the variables by adding a "b" to the numeric headers. This works:

 n = list(bs)
 for x in n:
     bs.rename(columns={x : 'b' + str(x)},inplace = True)

But I have a sneaking suspicion a lambda function would be better. However, this doesn't work:

 bs.rename(columns=lambda x: x = 'b' + str(x), inplace=True)

I thought lambda acted as a function, so if I pass in a column header I can append a 'b' to it. But the "=" throws an error. Any quick observations would be much appreciated. Cheers!

You could have just done bs.columns = 'b' + bs.columns.astype(str)
– EdChum, Commented Jan 26, 2017 at 22:27
Oh that's a nice solution too. I didn't know you could access all columns like that. Super helpful - thanks for pointing it out!
– Tim Gottgetreu, Commented Jan 26, 2017 at 22:32

MaxU - stand with Ukraine · Accepted Answer · 2017-01-26 22:32:27Z

6

I'd use add_prefix():

In [5]: bs = pd.DataFrame(np.random.rand(3,5))

In [6]: bs
Out[6]:
          0         1         2         3         4
0  0.521593  0.088293  0.623103  0.099417  0.983149
1  0.009741  0.465654  0.414261  0.024086  0.039543
2  0.476219  0.918162  0.900815  0.126549  0.112388

In [7]: bs.add_prefix('b')
Out[7]:
         b0        b1        b2        b3        b4
0  0.521593  0.088293  0.623103  0.099417  0.983149
1  0.009741  0.465654  0.414261  0.024086  0.039543
2  0.476219  0.918162  0.900815  0.126549  0.112388
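One detail worth spelling out: add_prefix() returns a new DataFrame rather than modifying bs in place, so the result has to be assigned if you want to keep it. A minimal sketch, reusing the random frame from above (the name `prefixed` is only illustrative):

```python
import numpy as np
import pandas as pd

bs = pd.DataFrame(np.random.rand(3, 5))

# add_prefix returns a new DataFrame; the original keeps its integer labels.
prefixed = bs.add_prefix('b')

print(list(bs.columns))        # original labels
print(list(prefixed.columns))  # prefixed labels

# Reassign to keep the renamed columns under the same name.
bs = bs.add_prefix('b')
```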

edited Jan 26, 2017 at 22:32

answered Jan 26, 2017 at 22:30


EdChum · Accepted Answer · 2017-01-26 22:28:35Z

5

You could've done this even more easily by casting the column labels to str with astype and prepending 'b' to them:

In [2]:
df = pd.DataFrame(columns=np.arange(5))
df

Out[2]:
Empty DataFrame
Columns: [0, 1, 2, 3, 4]
Index: []

In [4]:
df.columns = 'b' + df.columns.astype(str)
df.columns

Out[4]:
Index(['b0', 'b1', 'b2', 'b3', 'b4'], dtype='object')
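Since the question was about lambdas, here is a rough side-by-side of the vectorised concatenation and an element-wise equivalent using Index.map. This is only a sketch; the names `via_astype` and `via_map` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(columns=np.arange(5))

# The labels are integers here, so they are cast to str before concatenation.
via_astype = 'b' + df.columns.astype(str)

# Element-wise version: map calls the lambda once per column label.
via_map = df.columns.map(lambda x: 'b' + str(x))

# Either result can be assigned back to rename the columns.
df.columns = via_astype
```

Both produce the same labels; the astype version avoids a Python-level function call per label, which is why it tends to be preferred for simple string operations.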

answered Jan 26, 2017 at 22:28


Tim Gottgetreu · Accepted Answer · 2017-01-26 22:23:56Z

1

Ahhhh! Of course I figured it out before I could even submit it.

 bs.rename(columns=lambda x: 'b' + str(x), inplace=True)

This is of course the answer. The equals sign is redundant: whatever comes after the ":" is what the function will return, or "equal". Is that the correct idea?
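To make that concrete, here is a small sketch comparing the lambda with the same function written using def. The body of a lambda must be a single expression, and `x = ...` is an assignment statement, which is why the original attempt raised a SyntaxError. The names `add_b` and `add_b_def` and the tiny DataFrame are stand-ins for illustration, not the original data.

```python
import pandas as pd

# A lambda is an anonymous function whose body is a single expression;
# the value of that expression is what the function returns.
add_b = lambda x: 'b' + str(x)

# The same function written with def, for comparison.
def add_b_def(x):
    return 'b' + str(x)

# Hypothetical stand-in data with integer column labels 0, 1, 2.
bs = pd.DataFrame([[1, 2, 3]])

# rename calls the function once per column label and uses the return value.
renamed = bs.rename(columns=add_b)
print(list(renamed.columns))
```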

answered Jan 26, 2017 at 22:23


2 Comments

MaxU - stand with Ukraine Over a year ago

That is correct. But there is a better option for this particular example: bs.add_prefix('b') ;-)

EdChum Over a year ago

@MaxU never encountered that method before
