I want to experiment with the raw=True option of the pandas apply function, as per p. 155 of High Performance Python by Gorelick and Ozsvald. However, pandas apparently treats raw=True as an argument for the function I'm applying, not for .apply itself! Here's an MWE:
import pandas as pd
df = pd.DataFrame(columns=('a', 'b'))
df.loc[0] = (1, 2)
df.loc[1] = (3, 4)
df['a'] = df['a'].apply(str, raw=True)
When I try to execute this, I get the following error:
TypeError: 'raw' is an invalid keyword argument for str()
The error persists even if I use a lambda expression:
df['a'] = df['a'].apply(lambda x: str(x), raw=True)
The problem remains if I call a custom function instead of str.
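For example, a sketch of what I mean (to_str is just a throwaway stand-in for my real function):

def to_str(x):
    return str(x)

df['a'] = df['a'].apply(to_str, raw=True)
# Raises: TypeError: to_str() got an unexpected keyword argument 'raw'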
How do I get Pandas to recognize that raw=True is an argument for .apply and NOT str?
You're calling pd.Series.apply, not pd.DataFrame.apply, and Series doesn't seem to accept raw as an argument. Try df.apply(str, raw=True). Is that what you are searching for? Or df[['a']].apply(str, raw=True).

You don't need raw=True with a Series, because pd.Series.apply already passes raw values. raw=True is useful for pd.DataFrame.apply because it passes numpy arrays instead, which, depending on your function, can improve performance. As you can see in the documentation, there is no raw=True argument for a Series.
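To make the difference concrete, here is a minimal sketch (the example frame and the type-reporting lambda are my own illustration, not from the question):

import pandas as pd

df = pd.DataFrame({'a': [1, 3], 'b': [2, 4]})

# DataFrame.apply accepts raw=True: each column is handed to the
# function as a plain numpy ndarray instead of a Series, which can
# speed up purely numerical functions.
print(df.apply(lambda col: type(col).__name__, raw=True))
# a    ndarray
# b    ndarray
# dtype: object

# Series.apply has no raw parameter; it already hands the function one
# raw element at a time, so the element-wise conversion is simply:
df['a'] = df['a'].apply(str)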