0

I have a dataframe called df_freight and I would like to create a new column called "LM", based on a condition on another column called "Cost rate". The condition is: if the value contains the code "lm", write "LM"; otherwise write "Not lm".

df_freight = pd.DataFrame(
    {'Cost rate': ['11.53 LM', '12.22kg', '22 LM', 'sdfdfsdf'],
     'TO Number': ['x12', 'x13', 'x14', 'x15']})


df_fright["LM"] = df_fright.apply(lambda row: "LM" if row["Cost rate"].str.contans("lm") else "Not lm", axis=1)

but I am getting an AttributeError:

   ---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-94-d277ddb08fc7> in <module>
----> 1 df_fright["LM"] = df_fright.apply(lambda row: "LM" if row["Cost rate"].str.contans("lm") else "Not lm", axis=1)

~\Anaconda3\envs\general\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
   7766             kwds=kwds,
   7767         )
-> 7768         return op.get_result()
   7769 
   7770     def applymap(self, func, na_action: Optional[str] = None) -> DataFrame:

~\Anaconda3\envs\general\lib\site-packages\pandas\core\apply.py in get_result(self)
    183             return self.apply_raw()
    184 
--> 185         return self.apply_standard()
    186 
    187     def apply_empty_result(self):

~\Anaconda3\envs\general\lib\site-packages\pandas\core\apply.py in apply_standard(self)
    274 
    275     def apply_standard(self):
--> 276         results, res_index = self.apply_series_generator()
    277 
    278         # wrap results

~\Anaconda3\envs\general\lib\site-packages\pandas\core\apply.py in apply_series_generator(self)
    288             for i, v in enumerate(series_gen):
    289                 # ignore SettingWithCopy here in case the user mutates
--> 290                 results[i] = self.f(v)
    291                 if isinstance(results[i], ABCSeries):
    292                     # If we have a view on v, we need to make a copy because

<ipython-input-94-d277ddb08fc7> in <lambda>(row)
----> 1 df_fright["LM"] = df_fright.apply(lambda row: "LM" if row["Cost rate"].str.contans("lm") else "Not lm", axis=1)

AttributeError: 'str' object has no attribute 'str'

Isn't the syntax correct?

2
  • Also, df_fright is not defined, but that seems to be just a typo. In the future, please make a minimal reproducible example. Commented Mar 7, 2022 at 17:03
  • You have a typo: contans("lm") is missing an "i". Commented Mar 7, 2022 at 17:11

3 Answers

2

row["Cost rate"] is already a string, so you don't need .str. Also, to check whether a substring is contained in a string, use the in operator instead of contains().

import pandas as pd

df_freight = pd.DataFrame(
    {'Cost rate': ['11.53 LM', '12.22kg', '22 LM', 'sdfdfsdf'],
     'TO Number': ['x12', 'x13', 'x14', 'x15']})

df_freight["LM"] = df_freight.apply(lambda row: "LM" if "lm" in row["Cost rate"] else "Not lm", axis=1)

print(df_freight)
>   Cost rate TO Number      LM
  0  11.53 LM       x12  Not lm
  1   12.22kg       x13  Not lm
  2     22 LM       x14  Not lm
  3  sdfdfsdf       x15  Not lm

The check is False for every row, so every row gets "Not lm": string comparison is case-sensitive, and the data contains "LM", not "lm". Add .lower() to make the comparison case-insensitive:

import pandas as pd

df_freight = pd.DataFrame(
    {'Cost rate': ['11.53 LM', '12.22kg', '22 LM', 'sdfdfsdf'],
     'TO Number': ['x12', 'x13', 'x14', 'x15']})

df_freight["LM"] = df_freight.apply(lambda row: "LM" if "lm" in row["Cost rate"].lower() else "Not lm", axis=1)

print(df_freight)
>   Cost rate TO Number      LM
  0  11.53 LM       x12      LM
  1   12.22kg       x13  Not lm
  2     22 LM       x14      LM
  3  sdfdfsdf       x15  Not lm

2 Comments

Thanks, I thought I needed str since the type was object. I got another error due to the type when directly applying your code, so I changed the type to str and it worked well. Thanks for the .lower() tip; I didn't think of it, and the results would have been incomplete without it.
In the example data, all values in df_freight["Cost rate"] contain a valid string. I assume in reality they could contain something like a number or NaN, especially if they were read from a file and the type was determined automatically. Then .lower() would throw an error, so adding .str makes sense. For some reason, when I try this exact code, adding .str causes an error, but wrapping it as str(row["Cost rate"]) does the trick; basically two ways to write the same thing, I guess.
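To illustrate the concern about NaN or numeric values raised in the comment above, here is a minimal sketch. The sample data is hypothetical (the question's columns with a NaN and a float swapped in), and it assumes the column keeps object dtype. Passing na=False to Series.str.contains makes missing or non-string entries count as "no match" instead of raising or propagating NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical data: same columns as the question, but with a NaN
# and a plain float mixed into "Cost rate" (not from the original post).
df_freight = pd.DataFrame(
    {'Cost rate': ['11.53 LM', '12.22kg', np.nan, 22.0],
     'TO Number': ['x12', 'x13', 'x14', 'x15']})

# case=False ignores upper/lower case; na=False maps entries that
# cannot be searched (NaN, numbers) to False rather than NaN.
mask = df_freight['Cost rate'].str.contains('lm', case=False, na=False)

df_freight['LM'] = np.where(mask, 'LM', 'Not lm')
print(df_freight)
```

Only the first row should be labelled "LM" here; the NaN and the float both fall through to "Not lm".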
1

You can use a vectorized operation:

import numpy as np

# case=False makes the match case-insensitive, so 'LM' matches 'lm'
df_freight["LM"] = np.where(df_freight['Cost rate'].str.contains('lm', case=False),
                            'LM', 'Not lm')
print(df_freight)

# Output
  Cost rate TO Number      LM
0  11.53 LM       x12      LM
1   12.22kg       x13  Not lm
2     22 LM       x14      LM
3  sdfdfsdf       x15  Not lm

2 Comments

Thanks for the solution. I should push myself to use numpy more often instead of trying to avoid it.
This is the best solution, but it doesn't directly answer OP's question. Could you add a short explanation? Something like, "row["Cost rate"] is a string, so it doesn't have a str attribute. You want to use the whole column: df_freight['Cost rate'].str. Then you can use a vectorized operation:"
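As a rough illustration of the explanation requested in the comment above, this sketch uses the question's sample data to compare a single cell with the whole column. The variable names cell and column are only for illustration:

```python
import pandas as pd

df_freight = pd.DataFrame(
    {'Cost rate': ['11.53 LM', '12.22kg', '22 LM', 'sdfdfsdf'],
     'TO Number': ['x12', 'x13', 'x14', 'x15']})

# One value, as seen inside apply(..., axis=1): a plain Python string
cell = df_freight.loc[0, 'Cost rate']
# The whole column: a pandas Series, which provides the .str accessor
column = df_freight['Cost rate']

print(type(cell).__name__, hasattr(cell, 'str'))      # the cell has no .str
print(type(column).__name__, hasattr(column, 'str'))  # the Series does
```

This is why row["Cost rate"].str raises the AttributeError from the question, while df_freight['Cost rate'].str.contains(...) works on the column as a whole.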
1

Use this code instead. It is the simplest option mentioned here: apply the function to the column itself, so there is no need for .str, axis=1, or a vectorized call. Note that the check for "LM" is case-sensitive.

df_freight['LM'] = df_freight['Cost rate'].apply(lambda x: 'LM' if "LM" in x else "Not lm")

Output

    Cost rate   TO Number   LM
0   11.53 LM         x12    LM
1   12.22kg          x13    Not lm
2   22 LM            x14    LM
3   sdfdfsdf         x15    Not lm
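Since "LM" in x is case-sensitive, a lowercase entry such as "22 lm" would be labelled "Not lm". A possible variant, sketched under the assumption that mixed case can occur (the lowercase row and the helper name label are hypothetical), lowercases each value first and casts it with str() so non-string cells do not raise:

```python
import pandas as pd

# Hypothetical data: the question's frame with one lowercase "lm" entry
df_freight = pd.DataFrame(
    {'Cost rate': ['11.53 LM', '12.22kg', '22 lm', 'sdfdfsdf'],
     'TO Number': ['x12', 'x13', 'x14', 'x15']})

def label(value):
    # str() guards against non-string cells; lower() ignores case
    return 'LM' if 'lm' in str(value).lower() else 'Not lm'

df_freight['LM'] = df_freight['Cost rate'].apply(label)
print(df_freight)
```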
