What I want to do
I want to run an RFM analysis on purchase data from an e-commerce site.
I have already processed the data into RFM format, and now I want to assign each ID a rank based on the value in each column (money, recency, and frequency).
However, I get the error message below.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-15-e7bf5ddc856d> in <module>
13 return 5
14
---> 15 rfm['money rank'] = rfm['money'].apply(money)
16 rfm.head()
c:\users\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
7766 kwds=kwds,
7767 )
-> 7768 return op.get_result()
7769
7770 def applymap(self, func, na_action: Optional[str] = None) -> DataFrame:
c:\users\lib\site-packages\pandas\core\apply.py in get_result(self)
183 return self.apply_raw()
184
--> 185 return self.apply_standard()
186
187 def apply_empty_result(self):
c:\users\lib\site-packages\pandas\core\apply.py in apply_standard(self)
274
275 def apply_standard(self):
--> 276 results, res_index = self.apply_series_generator()
277
278 # wrap results
c:\users\lib\site-packages\pandas\core\apply.py in apply_series_generator(self)
288 for i, v in enumerate(series_gen):
289 # ignore SettingWithCopy here in case the user mutates
--> 290 results[i] = self.f(v)
291 if isinstance(results[i], ABCSeries):
292 # If we have a view on v, we need to make a copy because
<ipython-input-15-e7bf5ddc856d> in money(a)
1 def money(a):
----> 2 if a < 1000:
3 return 0
4 if (1000 <= a) & (a < 2000):
5 return 1
c:\users\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
1440 @final
1441 def __nonzero__(self):
-> 1442 raise ValueError(
1443 f"The truth value of a {type(self).__name__} is ambiguous. "
1444 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Data
```
      money  recency frequency
        sum <lambda>       len
ID
100    2674 169 days         1
101   19760  98 days         3
103    2674 167 days         1
109    7904  56 days         3
11     2674 211 days         1

<class 'pandas.core.frame.DataFrame'>
Index: 290 entries, 100 to 99
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 (money, sum) 290 non-null int64
1 (recency, <lambda>) 290 non-null timedelta64[ns]
2 (freqency, len) 290 non-null int64
dtypes: int64(2), timedelta64[ns](1)
memory usage: 9.1+ KB
```
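The tuple-style names in the `info()` output above, such as `(money, sum)`, suggest that the columns are a two-level MultiIndex (the kind typically produced by a `groupby(...).agg(...)` call with a dict of functions). A minimal sketch of what that implies, using a hypothetical stand-in frame built from the five sample rows; the construction below is an assumption for illustration, not the original processing code:

```python
import pandas as pd

# Hypothetical stand-in for rfm, using the sample rows shown above
# (recency is left out to keep the sketch short).
rfm = pd.DataFrame(
    [[2674, 1], [19760, 3], [2674, 1], [7904, 3], [2674, 1]],
    index=pd.Index(["100", "101", "103", "109", "11"], name="ID"),
    columns=pd.MultiIndex.from_tuples([("money", "sum"), ("frequency", "len")]),
)

# Selecting only the top level of a two-level column index
# returns a DataFrame (with the remaining level as its columns).
top_level = rfm["money"]

# Selecting with the full tuple returns a Series of plain numbers.
money_series = rfm[("money", "sum")]

print(type(top_level).__name__)
print(type(money_series).__name__)
```

If the real `rfm` behaves the same way, `rfm['money']` is a DataFrame rather than a Series, which matters for how `apply` calls the function.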
Code
```
def money(a):
    if a < 1000:
        return 0
    if (1000 <= a) & (a < 2000):
        return 1
    if (2000 <= a) & (a < 3000):
        return 2
    if (3000 <= a) & (a < 4000):
        return 3
    if (4000 <= a) & (a < 5000):
        return 4
    if a >= 5000:
        return 5

rfm['money rank'] = rfm['money'].apply(money)
```
I tried different kinds of parentheses, but none of them worked.
If you could help me out, I'd be very grateful. Thank you in advance!
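One possible reading of the traceback: `apply` is entered through `pandas\core\frame.py`, which suggests `rfm['money']` is a DataFrame, so `money` receives a whole column (a Series) and `a < 1000` cannot be reduced to a single True/False. A sketch of one way around this, assuming the two-level columns shown in the Data section: select the column with its full tuple so that `apply` passes one number at a time. The sample frame is a hypothetical stand-in, and `pd.cut` is shown only as an alternative that is not in the original code.

```python
import pandas as pd

# Hypothetical sample shaped like the data in the question.
rfm = pd.DataFrame(
    [[2674, 1], [19760, 3], [2674, 1], [7904, 3], [2674, 1]],
    index=pd.Index(["100", "101", "103", "109", "11"], name="ID"),
    columns=pd.MultiIndex.from_tuples([("money", "sum"), ("frequency", "len")]),
)

def money(a):
    """Map a single amount to a rank from 0 to 5 in steps of 1000."""
    if a < 1000:
        return 0
    if a < 2000:
        return 1
    if a < 3000:
        return 2
    if a < 4000:
        return 3
    if a < 5000:
        return 4
    return 5

# Full tuple key -> Series -> apply() calls money() once per scalar value.
money_rank = rfm[("money", "sum")].apply(money)

# Alternative: bin the whole column at once.
# right=False makes the bins [lower, upper), matching the comparisons above.
bins = [float("-inf"), 1000, 2000, 3000, 4000, 5000, float("inf")]
money_rank_cut = pd.cut(rfm[("money", "sum")], bins=bins, labels=False, right=False)

# Store the result under a two-level key so it fits the existing columns.
rfm[("money rank", "")] = money_rank
print(rfm)
```

Flattening the column index right after the aggregation would be another option; after that, `rfm['money']` would be a Series and the original `apply` call should work unchanged.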