I have a dataframe as follows:
import polars as pl
df = pl.DataFrame({'r_num':['Yes', '', 'Yes'], 'pin': ['Yes','',''],'fin':['','','']})
shape: (3, 3)
┌───────┬─────┬─────┐
│ r_num ┆ pin ┆ fin │
│ ---   ┆ --- ┆ --- │
│ str   ┆ str ┆ str │
╞═══════╪═════╪═════╡
│ Yes   ┆ Yes ┆     │
│       ┆     ┆     │
│ Yes   ┆     ┆     │
└───────┴─────┴─────┘
Here I would like to find the rows where r_num is Yes, pin is Yes and fin is empty. For rows that meet this condition, r_num and pin should both be set to an empty string.
df.with_columns(
    pl.when((pl.col('r_num')=='Yes') & (pl.col('pin')=='Yes') & (pl.col('fin') !='Yes'))
    .then(pl.col('r_num')=='')
    .otherwise(pl.col('r_num'))
)
shape: (3, 3)
┌───────┬─────┬─────┐
│ r_num ┆ pin ┆ fin │
│ ---   ┆ --- ┆ --- │
│ str   ┆ str ┆ str │
╞═══════╪═════╪═════╡
│ false ┆ Yes ┆     │
│       ┆     ┆     │
│ Yes   ┆     ┆     │
└───────┴─────┴─────┘
Why is r_num getting filled with false?
This is how I would do it in pandas:
df_pd = df.to_pandas()
df_pd.loc[(df_pd['r_num']=='Yes') & (df_pd['pin']=='Yes') & (df_pd['fin']!='Yes'),['r_num','pin']] = ''
Expected result:
shape: (3, 3)
┌───────┬─────┬─────┐
│ r_num ┆ pin ┆ fin │
│ ---   ┆ --- ┆ --- │
│ str   ┆ str ┆ str │
╞═══════╪═════╪═════╡
│       ┆     ┆     │
│       ┆     ┆     │
│ Yes   ┆     ┆     │
└───────┴─────┴─────┘
when/then/otherwise: stackoverflow.com/a/73718390/18559875