The first step would be to check if your task can be solved natively using Polars Expressions.
If a custom function is necessary, .map_elements() can be used to apply it row by row.
To pass in values from multiple columns, you can utilize the Struct data type.
For example, with pl.struct():
>>> df.select(pl.struct(pl.all())) # all columns
shape: (3, 1)
┌───────────┐
│ foo       │
│ ---       │
│ struct[3] │
╞═══════════╡
│ {1,4,7}   │
│ {2,5,8}   │
│ {3,6,9}   │
└───────────┘
Using pl.struct(...).map_elements will pass the values to the custom function as a dict argument.
import polars as pl

def my_complicated_function(row: dict) -> int:
    """
    A function that cannot utilize polars expressions.
    This should be avoided.
    """
    # a dict with column names as keys
    print(f"[DEBUG]: {row=}")
    # do some work
    return row["foo"] + row["bar"] + row["baz"]

df = pl.DataFrame({
    "foo": [1, 2, 3],
    "bar": [4, 5, 6],
    "baz": [7, 8, 9]
})

df = df.with_columns(
    pl.struct(pl.all())
      .map_elements(my_complicated_function, return_dtype=pl.Int64)
      .alias("foo + bar + baz")
)
# [DEBUG]: row={'foo': 1, 'bar': 4, 'baz': 7}
# [DEBUG]: row={'foo': 2, 'bar': 5, 'baz': 8}
# [DEBUG]: row={'foo': 3, 'bar': 6, 'baz': 9}
shape: (3, 4)
┌─────┬─────┬─────┬─────────────────┐
│ foo ┆ bar ┆ baz ┆ foo + bar + baz │
│ --- ┆ --- ┆ --- ┆ ---             │
│ i64 ┆ i64 ┆ i64 ┆ i64             │
╞═════╪═════╪═════╪═════════════════╡
│ 1   ┆ 4   ┆ 7   ┆ 12              │
│ 2   ┆ 5   ┆ 8   ┆ 15              │
│ 3   ┆ 6   ┆ 9   ┆ 18              │
└─────┴─────┴─────┴─────────────────┘