Sum of products of columns in polars

Question

I have a dataset, part of which looks like this:

customer	product	price	quantity	sale_time
C060235	P0204	6.99	2	2024-03-11 08:24:11
C045298	P0167	14.99	1	2024-03-11 08:35:06
...
C039877	P0024	126.95	1	2024-09-30 21:18:45

What I want is a list of unique customer, product pairs with the total sales, so something like:

customer	product	total
C0000105	P0168	643.78
C0000105	P0204	76.88
...
C1029871	P1680	435.44

Here's my attempt at constructing this. This gives me the grand total of all sales, which isn't what I want. What's a correct approach?

import polars as pl

db.select(
    (
        pl.col('customer'),
        pl.col('product'),
        pl.col('quantity').mul(pl.col('price')).alias('total')
    )
).group_by(('customer', 'product'))

Can you please add the exact output that you get when you run that code?
– Starship, Commented Mar 13 at 15:43

EuanG · Accepted Answer · 2025-03-14 09:51:20Z

3

To do this, calculate the sale amount for each row, group by both the customer and product columns, and then sum the calculated amounts within each group.

Your current query has a few issues:

You're selecting product and customer but grouping by item_lookup_key and shopper_card_number
You need to use an aggregation function after grouping

This approach works:

db.group_by(["customer", "product"]).agg([
    ((pl.col("quantity") * pl.col("price")).sum()).alias("total")
])

A more concise alternative is Expr.dot:

db.group_by("customer", "product").agg(
    total=pl.col("quantity").dot("price")
)

edited Mar 14 at 9:51

answered Mar 13 at 15:42

EuanG


2 Comments

orlp Mar 13 at 18:47

You can write the alias a bit nicer too:

df.group_by(["customer", "product"]).agg(total = (pl.col("quantity") * pl.col("price")).sum()).sort(["customer", "product"])

BallpointBen Mar 13 at 20:31

I think expr.dot can be used here for even more concise code.

ticktalk · Accepted Answer · 2025-03-13 15:54:43Z

As you haven't shown all the columns named in your example (i.e. 'item_lookup_key', 'shopper_card_number'), here's a trivial example that hopefully provides enough for you to progress.

NB: I am using polars 1.24.0 (Linux Mint 20.x).


$ cat wester.py
import polars as pl

# Sample dataset
data = {
    "customer": ["C060235", "C045298", "C039877", "C060235", "C039877"],
    "product": ["P0204", "P0167", "P0024", "P0204", "P0024"],
    "price": [6.99, 14.99, 126.95, 6.99, 126.95],
    "quantity": [2, 1, 1, 3, 2],
    "sale_time": [
        "2024-03-11 08:24:11",
        "2024-03-11 08:35:06",
        "2024-09-30 21:18:45",
        "2024-04-15 10:12:30",
        "2024-10-01 15:22:10",
    ],
}

df = pl.DataFrame(data)

# total sales by (customer, product)
result = (
    df.with_columns((pl.col("price") * pl.col("quantity")).alias("total_sales"))
    .group_by(["customer", "product"])
    .agg(pl.sum("total_sales").alias("total_sales"))
)

print(result)

$ python wester.py
shape: (3, 3)
┌──────────┬─────────┬─────────────┐
│ customer ┆ product ┆ total_sales │
│ ---      ┆ ---     ┆ ---         │
│ str      ┆ str     ┆ f64         │
╞══════════╪═════════╪═════════════╡
│ C039877  ┆ P0024   ┆ 380.85      │
│ C045298  ┆ P0167   ┆ 14.99       │
│ C060235  ┆ P0204   ┆ 34.95       │
└──────────┴─────────┴─────────────┘

My apologies, I was sloppy in naming the fields in the example. Fixed now.

Henry Harbeck · Accepted Answer · 2025-03-14 03:31:52Z

1

df.group_by("customer", "product").agg(total=pl.col("quantity").dot("price"))

Expr.dot computes the sum of the products (i.e., the dot product). There is also no need for a list (square brackets) in either group_by or agg.

answered Mar 14 at 3:31

Henry Harbeck

