4

I am moving from Oracle to Postgresql. I am trying to convert some Oracle hierarchical queries to Postgres. For example, in Oracle to return a comma-delimited ordered list of all ids under (i.e., the children) and including the id_to_start_with I would do the following:

SELECT LISTAGG(id_something, ',') WITHIN GROUP (ORDER BY id_something) AS somethings FROM(
SELECT DISTINCT D.id_something
FROM something_table D
START WITH D.id_something = :id_to_start_with
CONNECT BY D.id_something_parent = PRIOR D.id_something
)

The equivalent in Postgres would seem to be:

WITH RECURSIVE the_somethings(id_something) AS (
SELECT id_something
FROM something_table
WHERE id_something = $id_to_start_with
UNION ALL
SELECT D.id_something
FROM the_somethings DR
JOIN something_table D ON DR.id_something = D.id_something_parent
)
SELECT string_agg(temp_somethings.id_something::TEXT, ',') AS somethings
FROM (
SELECT id_something
FROM the_somethings
ORDER BY id_something
) AS temp_somethings

Likewise if I want to return a comma-delimited ordered list of all ids above (i.e., the parents) and including the id_to_start_with I would do the following in Oracle:

SELECT LISTAGG(id_something, ',') WITHIN GROUP (ORDER BY id_something) AS somethings FROM(
SELECT DISTINCT D.id_something
FROM something_table D
START WITH D.id_something = :id_to_start_with
CONNECT BY D.id_something = PRIOR D.id_something_parent
)

The equivalent in Postgres would seem to be:

WITH RECURSIVE the_somethings(id_something, path) AS (
SELECT id_something
, id_something::TEXT as path
FROM something_table
WHERE id_something_parent IS NULL
UNION ALL
SELECT D.id_something
, (DR.path || ',' || D.id_something::TEXT) as path
FROM something_table D
JOIN the_somethings DR ON DR.id_something = D.id_something_parent
)
SELECT path
FROM the_somethings
WHERE id_something = $id_to_start_with
ORDER BY id_something

My question has to do with the last Postgres query. It seems terribly inefficient to me and I wonder if there is a better way to write it. That is, in Oracle the query will look for the parent of the id_to_start_with, then the parent of the parent, and so forth to the root.

The Postgres query, on the other hand, gets every single root to child path combination possible and then throws everything away except for the one root to id_to_start_with that I am looking for. That is potentially a ton of data to create just to throw it all away except for the one row I am looking for.

Is there a way to get a comma-delimited ordered list of all the parents of a particular id_to_start_with that is as performant in Postgres as it is in Oracle?

Edit: Adding explain plans from Oracle and Postgres.

Oracle Explain Plan Output Oracle Plan ScreenShot

Postgres Explain Analyyze Output

CTE Scan on the_somethings  (cost=62.27..74.66 rows=3 width=76) (actual time=0.361..0.572 rows=1 loops=1)
  Filter: (id_something = 1047)
  Rows Removed by Filter: 82
  CTE the_somethings
    ->  Recursive Union  (cost=0.00..62.27 rows=551 width=76) (actual time=0.026..0.433 rows=83 loops=1)
          ->  Seq Scan on something_table  (cost=0.00..2.83 rows=1 width=8) (actual time=0.023..0.034 rows=1 loops=1)
                Filter: (id_something_parent IS NULL)
                Rows Removed by Filter: 82
          ->  Hash Join  (cost=0.33..4.84 rows=55 width=76) (actual time=0.028..0.065 rows=16 loops=5)
                Hash Cond: (d.id_something_parent = dr.id_something)
                ->  Seq Scan on something_table d  (cost=0.00..2.83 rows=83 width=16) (actual time=0.002..0.012 rows=83 loops=5)
                ->  Hash  (cost=0.20..0.20 rows=10 width=76) (actual time=0.009..0.009 rows=17 loops=5)
                      Buckets: 1024  Batches: 1  Memory Usage: 10kB
                      ->  WorkTable Scan on the_somethings dr  (cost=0.00..0.20 rows=10 width=76) (actual time=0.001..0.004 rows=17 loops=5)
Planning time: 0.407 ms
Execution time: 0.652 ms

This is the final query based on Jakub's answer below.

WITH RECURSIVE the_somethings(id_something, path, level, orig_id, id_something_parent) AS (

SELECT id_something

, id_something::TEXT as path

, 0 as level

, id_something AS orig_id

, id_something_parent

FROM something_table

WHERE id_something IN (1047, 448)

UNION ALL

SELECT D.id_something

, (D.id_something::TEXT || ',' || DR.path) as path

, DR.level + 1 as level

, DR.orig_id as orig_id

, D.id_something_parent

FROM something_table D

JOIN the_somethings DR ON D.id_something = DR.id_something_parent

)

SELECT DISTINCT ON(orig_id) orig_id, path

FROM the_somethings

ORDER BY orig_id, level DESC
;
8
  • Just reverse the join e.g. FROM the_somethings DR JOIN something_table D and it will look like what you've described. But I don't think that will change the query plan, I think the planner already do what you want. Commented May 5, 2016 at 18:15
  • Maybe I don't know how to read a Postgres explain plan yet, but when I see a line like: Recursive Union (cost=0.00..62.27 rows=551 width=76) I assume it is working with 551 rows. When I look at the Oracle explain plan it is only working with a maximum of 4 rows of data (the id_to_start_with and it's parent rows). Commented May 5, 2016 at 18:32
  • 2
    That is just the plain EXPLAIN (which shows estimated info). Run EXPLAIN ANALYZE for real data usage. Commented May 5, 2016 at 20:23
  • 1
    Your query takes less then one millisecond. How fast do you need it to be? Commented May 5, 2016 at 21:57
  • 1
    This is just a test with just a few hundred records. Surely there is a way to have it only look at the parents and not at all relationships when dealing with a huge hierarchy. I'm not trying to do premature optimization. I just want it to work like how I understand Oracle already works. Commented May 5, 2016 at 22:00

1 Answer 1

1

CTEs in PostgreSQL are fenced meaning they will be materialized and only then will the filter from outer query will be applied. To make the query perform correctly build it the other way around and put the filter inside the CTE.

WITH RECURSIVE the_somethings(id_something, path) AS (
SELECT id_something
, id_something::TEXT as path, 0 as level, id_something AS orig_id
FROM something_table
WHERE id_something IN ($id_to_start_with,$id_to_start_with2)
UNION ALL
SELECT D.id_something
, (D.id_something::TEXT || ',' || DR.path) as path, DR.level + 1, DR.orig_id
FROM something_table D
JOIN the_somethings DR ON DR.id_something_parent = D.id_something
)
SELECT DISTINCT ON(orig_id) orig_id, path
FROM the_somethings
ORDER BY orig_id, DR.level DESC
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.