1

I'm finally breaking free from the shackles of SPSS and am reveling in the freedom of pandas and Python (love it). However, I'm trying to get a clearer picture of how Python lambda functions work with pandas, since they seem to pop up a lot. Here is an example I hope will clear up the murkiness.

After creating a new dataframe from a string split:

 bs = fh['basis'].str.split(',',expand = True)

I want to rename all the variables by adding a "b" to the numeric headers. This works:

 n = list(bs)
 for x in n:
     bs.rename(columns={x : 'b' + str(x)},inplace = True)

But I have a sneaking suspicion a lambda function would be better. However, this doesn't work:

 bs.rename(columns=lambda x: x = 'b' + str(x), inplace=True)

I thought a lambda acted as a function, so if I pass in a column header I can prepend a 'b' to it. But the "=" throws an error. Any quick observations would be much appreciated. Cheers!

2
  • You could have just done bs.columns = 'b' + bs.columns.astype(str) Commented Jan 26, 2017 at 22:27
  • Oh that's a nice solution too. I didn't know you could access all columns like that. Super helpful - thanks for pointing it out! Commented Jan 26, 2017 at 22:32

3 Answers

6

I'd use add_prefix():

In [5]: bs = pd.DataFrame(np.random.rand(3,5))

In [6]: bs
Out[6]:
          0         1         2         3         4
0  0.521593  0.088293  0.623103  0.099417  0.983149
1  0.009741  0.465654  0.414261  0.024086  0.039543
2  0.476219  0.918162  0.900815  0.126549  0.112388

In [7]: bs.add_prefix('b')
Out[7]:
         b0        b1        b2        b3        b4
0  0.521593  0.088293  0.623103  0.099417  0.983149
1  0.009741  0.465654  0.414261  0.024086  0.039543
2  0.476219  0.918162  0.900815  0.126549  0.112388
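
One thing worth noting, since the question uses inplace=True: add_prefix() returns a new DataFrame and leaves the original untouched, so the result has to be assigned to a name to keep it. A minimal sketch of that, where the variable names prefixed and original_labels are only illustrative:

```python
import numpy as np
import pandas as pd

bs = pd.DataFrame(np.random.rand(3, 5))  # columns are the integers 0..4

# add_prefix builds a new DataFrame; bs itself is not modified
prefixed = bs.add_prefix('b')
original_labels = list(bs.columns)
print(original_labels)         # still the numeric labels
print(list(prefixed.columns))  # the prefixed labels

# To keep the change under the same name, rebind it
bs = bs.add_prefix('b')
```

Rebinding the name gives the same end result as the rename(..., inplace=True) loop in the question.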

5

A simpler option is to cast the column labels to str with astype and concatenate them onto 'b':

In [2]:
df = pd.DataFrame(columns=np.arange(5))
df

Out[2]:
Empty DataFrame
Columns: [0, 1, 2, 3, 4]
Index: []

In [4]:
df.columns = 'b' + df.columns.astype(str)
df.columns

Out[4]:
Index(['b0', 'b1', 'b2', 'b3', 'b4'], dtype='object')


1

Ahhhh! Of course I figured it out before I could even submit it.

 bs.rename(columns=lambda x: 'b' + str(x), inplace=True)

is of course the answer. The equals sign is not needed: whatever comes after the ":" is what the function returns. Is that the correct idea?
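
To spell out why the "=" version fails: the body of a lambda must be a single expression, and its value is returned automatically. `x = 'b' + str(x)` is an assignment statement, not an expression, so Python raises a SyntaxError before pandas ever sees it. A rough sketch of the equivalence, using plain Python with no pandas (the names add_b and add_b_lambda are made up for illustration):

```python
# A named function with an explicit return...
def add_b(x):
    return 'b' + str(x)

# ...and the equivalent lambda: the expression after ":" is the return value
add_b_lambda = lambda x: 'b' + str(x)

# rename(columns=...) calls the function once per column label,
# which is roughly what this list comprehension does by hand
labels = [0, 1, 2, 3, 4]
renamed = [add_b_lambda(x) for x in labels]
print(renamed)
```

Either add_b or the lambda could be passed as columns= to rename(); the lambda just saves defining a named function for a one-off.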

2 Comments

That is correct. But there is a better option for this particular example: bs.add_prefix('b') ;-)
@MaxU never encountered that method before
