Is it possible in Python Polars to transform the root names in an expression's metadata? For example, given an expression like

expr = pl.col("A").dot(pl.col("B")).alias("AdotB")

can I add suffixes to the root names, transforming the expression into

pl.col("A_suffix").dot(pl.col("B_suffix")).alias("AdotB_suffix")

I know that expr.meta.root_names() returns a list of the column names, but I could not find a way to transform them.

1 Answer

There is an example in the tests that does query plan node rewriting in Python with callbacks, but I can't see any equivalent API for rewriting expressions.

Out of interest, there is expr.meta.serialize(), which can dump an expression to JSON.

expr.meta.serialize(format="json")
# '{"Alias":[{"Agg":{"Sum":{"BinaryExpr":{"left":{"Column":"A"},"op":"Multiply","right":{"Column":"B"}}}}},"AdotB"]}'
#    ^^^^^                                         ^^^^^^^^^^                             ^^^^^^^^^^        ^^^^^

Technically, you could modify the Alias and Column values, and .deserialize() back into an expression.

import json

import polars as pl


def suffix_all(expr, suffix):
    """Append `suffix` to every column name and alias in `expr`."""

    def _add_suffix(obj):
        # Called by json.loads for each decoded JSON object, innermost first.
        if "Column" in obj:
            obj["Column"] = obj["Column"] + suffix
        if "Alias" in obj:
            # The alias name is the last element of the "Alias" list.
            obj["Alias"][-1] += suffix
        return obj

    ast = expr.meta.serialize(format="json")
    new_ast = json.loads(ast, object_hook=_add_suffix)

    return pl.Expr.deserialize(json.dumps(new_ast).encode(), format="json")


df = pl.DataFrame({"A_suffix": [2, 7, 3], "B_suffix": [10, 7, 1]})

expr = pl.col("A").dot(pl.col("B")).alias("AdotB")

df.with_columns(expr.pipe(suffix_all, "_suffix"))

shape: (3, 3)
┌──────────┬──────────┬──────────────┐
│ A_suffix ┆ B_suffix ┆ AdotB_suffix │
│ ---      ┆ ---      ┆ ---          │
│ i64      ┆ i64      ┆ i64          │
╞══════════╪══════════╪══════════════╡
│ 2        ┆ 10       ┆ 72           │
│ 7        ┆ 7        ┆ 72           │
│ 3        ┆ 1        ┆ 72           │
└──────────┴──────────┴──────────────┘

This does seem to "work" in this case, but the serialize docs contain a warning:

Serialization is not stable across Polars versions

So it's probably not a recommended approach in general.
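
Since the JSON layout can change between releases, one cheap safeguard is to record which Column nodes the hook actually touched and compare them with expr.meta.root_names(). Below is a minimal sketch, assuming the JSON shape shown above. The helper name suffix_json and the hard-coded AST string are only for illustration, and the sketch uses just the standard library so the rewrite step can be checked without Polars:

```python
import json

# Copied from the serialize() output shown above. In real use this string
# would come from expr.meta.serialize(format="json"), and its layout may
# differ between Polars versions.
AST = '{"Alias":[{"Agg":{"Sum":{"BinaryExpr":{"left":{"Column":"A"},"op":"Multiply","right":{"Column":"B"}}}}},"AdotB"]}'


def suffix_json(ast: str, suffix: str) -> tuple[str, list[str]]:
    """Return the rewritten JSON and the original names of the renamed columns."""
    renamed: list[str] = []

    def _hook(obj: dict) -> dict:
        # json.loads calls this for every decoded JSON object, innermost first.
        if isinstance(obj.get("Column"), str):
            renamed.append(obj["Column"])
            obj["Column"] += suffix
        alias = obj.get("Alias")
        # Assumption: the alias name is the last element of the "Alias" list.
        if isinstance(alias, list) and alias and isinstance(alias[-1], str):
            alias[-1] += suffix
        return obj

    new_ast = json.loads(ast, object_hook=_hook)
    return json.dumps(new_ast), renamed


new_json, renamed = suffix_json(AST, "_suffix")
print(renamed)  # ['A', 'B']
print(new_json)
```

If renamed does not match expr.meta.root_names(), the serialized format has probably changed and the rewritten JSON should not be deserialized.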
