
I have a DuckDB table objects with an int32 column type and a custom (Python) scalar UDF type_str that converts the enum value to a human-readable string.
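
A minimal sketch of how such a UDF is registered with DuckDB's Python API (the mapping and database file below are placeholders, not my real type_str):

import duckdb

# Hypothetical enum-to-name mapping; the real type_str logic is not shown here.
TYPE_NAMES = {1: "CAN_FRAME", 2: "ETH_FRAME"}

def type_str(type_id: int) -> str:
    # Fall back to a placeholder name for unknown enum values.
    return TYPE_NAMES.get(type_id, f"UNKNOWN_{type_id}")

con = duckdb.connect("objects.db")  # assumed database file name
con.create_function("type_str", type_str,
                    [duckdb.typing.INTEGER], duckdb.typing.VARCHAR)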

This query is very fast:

select type_str(type) as name, type, count(*) as count from objects
group by type 
having count > 1000
order by count desc;

which means the type_str function is not called for every row.

However, this query is very slow:

select type_str(type) as name, type, count(*) as count from objects
group by type 
having count > 1000 and name[0:3] = 'CAN'
order by count desc;

The documentation of HAVING says

The HAVING clause can be used after the GROUP BY clause to provide filter criteria after the grouping has been completed.

So I don't understand why this second query is much slower. It shouldn't have to do more work.
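
One way to check where the filter is actually evaluated is to look at the plan with EXPLAIN; a sketch, assuming the UDF was registered on a connection con as above:

# Print the query plan to see whether the filter on name is applied
# per row (via the UDF) or after aggregation.
for _, plan in con.execute("""
    explain
    select type_str(type) as name, type, count(*) as count from objects
    group by type
    having count > 1000 and name[0:3] = 'CAN'
    order by count desc
""").fetchall():
    print(plan)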

  • EXPLAIN suggests that it is running the UDF on all rows first: FILTER (array_slice(type_str(..))) -> FILTER (count ...). There seem to be a couple of similar issues on GitHub (github.com/duckdb/duckdb/issues?q=udf+filter), but not this specific case. You may need to ask DuckDB directly if you don't get a reply here. Commented Nov 14, 2024 at 18:09
  • Even putting the whole first query as a subquery and then selecting from that is slow. I may have to materialize it as a new table in order to make this fast. Commented Nov 15, 2024 at 9:52

1 Answer


Here is a work-around:

create temp table type_count as
select type_str(type) as name, type, count(*) as count from objects
group by type 
order by count desc;

select name, type, count from type_count
where name[0:3] = 'CAN'
order by count desc;

This is fast.
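
This presumably works because the temp table is materialized once, with type_str evaluated only per group (as the fast first query in the question suggests), and the string filter then runs over the small type_count table without invoking the UDF again. If you are driving this from Python, a sketch of the same workaround through the connection that registered the UDF (con as assumed in the question):

con.execute("""
    create temp table type_count as
    select type_str(type) as name, type, count(*) as count from objects
    group by type
    order by count desc
""")

rows = con.execute("""
    select name, type, count from type_count
    where name[0:3] = 'CAN'
    order by count desc
""").fetchall()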
