I have this dataframe:

import polars as pl

df = pl.from_repr("""
┌─────┬───────┐
│ one ┆ two   │
│ --- ┆ ---   │
│ str ┆ str   │
╞═════╪═══════╡
│ a   ┆ hola  │
│ b   ┆ world │
└─────┴───────┘
""")

And I want to change hola for hello:

shape: (2, 2)
┌─────┬───────┐
│ one ┆ two   │
│ --- ┆ ---   │
│ str ┆ str   │
╞═════╪═══════╡
│ a   ┆ hello │ # <-
│ b   ┆ world │
└─────┴───────┘

How can I change the values of a row based on a condition in another column?

For instance, with PostgreSQL I could do this:

UPDATE my_table SET two = 'hello' WHERE one = 'a';

Or in Spark

my_table.withColumn("two", when(col("one") == "a", "hello"))

I've tried using with_columns(pl.when(pl.col("one") == "a").then("hello")) but that changes the column "one".

EDIT: I could create a SQL instance and work my way through via SQL, but there must be a way to achieve this via the Python API.

3 Answers

22

You were really close with with_columns(pl.when(pl.col("one") == "a").then("hello")), but you need to tell Polars which column the result should go into.

If you don't name the output column, Polars has to infer one, and in this case it picked "one", the column referenced in the condition.

Instead you do

(df 
    .with_columns(
        two=pl.when(pl.col('one')=='a')
                .then(pl.lit('hello'))
                .otherwise(pl.col('two')))
)

This uses the **kwargs input of with_columns, which lets the column name sit on the left of an equals sign as though it were a parameter to a function. You can also use the alias syntax, like this:

(df
    .with_columns(
        (pl.when(pl.col('one')=='a')
            .then(pl.lit('hello'))
            .otherwise(pl.col('two')))
        .alias('two')
    )
)

Note that I wrapped the entire when/then/otherwise in parentheses. The way .alias() binds relative to when/then/otherwise can be surprising, so I find it's best to always wrap the whole expression in parentheses to avoid unexpected results. The worst case is a redundant pair of parentheses, which doesn't hurt anything.


1 Comment

100%. This is what I was looking for. So simple but I completely missed it. Very interesting re the wrapping! Many thanks indeed

7

You need to name the expression so that it overwrites the existing two column, either with .alias() or with a named argument: .with_columns(two = ...).

You also need to provide .otherwise(), as the default is .otherwise(None), which would set every non-matching row to null.

df = pl.DataFrame({"one": ["a", "b"], "two": ["hola", "world"]})

df.with_columns(
   pl.when(pl.col.one == "a")
     .then(pl.lit("hello"))
     .otherwise(pl.col.two)
     .alias("two")
)

shape: (2, 2)
┌─────┬───────┐
│ one ┆ two   │
│ --- ┆ ---   │
│ str ┆ str   │
╞═════╪═══════╡
│ a   ┆ hello │
│ b   ┆ world │
└─────┴───────┘

The .then() branch determines the output name, so another option is to invert the logic and put the column inside .then(), which lets Polars infer the name for you.

df.with_columns(
   pl.when(pl.col.one != "a")
     .then(pl.col.two)
     .otherwise(pl.lit("hello"))
)

shape: (2, 2)
┌─────┬───────┐
│ one ┆ two   │
│ --- ┆ ---   │
│ str ┆ str   │
╞═════╪═══════╡
│ a   ┆ hello │
│ b   ┆ world │
└─────┴───────┘

-4

Here is a different solution, which uses df.filter to get the indices of the rows to be modified and then a simple for-loop to apply the change. I know loops are not the best for speed. This is just an alternative; it is not the best solution.

df = pl.DataFrame({'one':['a','b'], 'two':['hola','world']}).with_row_index()

s = list(df.filter(pl.col('one') == 'a')['index'])  # get indices of filtered rows
for idx in s:  # change all the selected rows
    df[idx, 'two'] = 'hello'

print(df)

shape: (2, 3)
┌───────┬─────┬───────┐
│ index ┆ one ┆ two   │
│ ---   ┆ --- ┆ ---   │
│ u32   ┆ str ┆ str   │
╞═══════╪═════╪═══════╡
│ 0     ┆ a   ┆ hello │
│ 1     ┆ b   ┆ world │
└───────┴─────┴───────┘

1 Comment

This is incredibly inefficient. Might as well use plain python.
