With Postgres it's quite easy to prevent this by collecting all visited nodes in an array.
Setup:
create table hierarchy (id integer, parent_id integer);
insert into hierarchy
values
(1, null), -- root element
(2, 1), -- first child
(3, 1), -- second child
(4, 3),
(5, 4),
(3, 5); -- endless loop
Graph represented:
1 --+--> 2
|
+--> 3 --> 4 --> 5 --+
^ |
| |
+---------------+
Recursive query:
with recursive tree as (
select id,
parent_id,
array[id] as all_parents
from hierarchy
where parent_id is null
union all
select c.id,
c.parent_id,
p.all_parents||c.id
from hierarchy c
join tree p
on c.parent_id = p.id
and c.id <> ALL (p.all_parents) -- this is the trick to exclude the endless loops
)
select *
from tree;
Output:
id | parent_id | all_parents
----+-----------+-------------
1 | | {1}
2 | 1 | {1,2}
3 | 1 | {1,3}
4 | 3 | {1,3,4}
5 | 4 | {1,3,4,5}
(5 rows)
To be able to detect if a loop exists or not, you can do:
with recursive tree(id, parent_id, depth, is_cycle, all_parents) as (
select id,
parent_id,
0,
false,
array[id] as all_parents
from hierarchy
where parent_id is null
union all
select c.id,
c.parent_id,
p.depth + 1,
c.id = any(all_parents),
p.all_parents||c.id
from hierarchy c
join tree p
on c.parent_id = p.id
and not is_cycle
)
select *
from tree
where is_cycle;
which outputs:
id | parent_id | depth | is_cycle | all_parents
----+-----------+-------+----------+-------------
3 | 5 | 4 | t | {1,3,4,5,3}
(1 row)
Because there was one row in the output, this means that there was an infinite cycle and this would have been its first step.
To do this for multiple trees at the same time, you need to carry over the ID of the root node to the children:
with recursive tree as (
select id,
parent_id,
array[id] as all_parents,
id as root_id
from hierarchy
where parent_id is null
union all
select c.id,
c.parent_id,
p.all_parents||c.id,
p.root_id
from hierarchy c
join tree p
on c.parent_id = p.id
and c.id <> ALL (p.all_parents) -- this is the trick to exclude the endless loops
and c.root_id = p.root_id
)
select *
from tree;
Update for Postgres 14
Postgres 14 introduced the (standard compliant) CYCLE option to detect cycles:
with recursive tree as (
select id,
parent_id
from hierarchy
where parent_id is null
union all
select c.id,
c.parent_id
from hierarchy c
join tree p
on c.parent_id = p.id
)
cycle id -- track cycles for this column
set is_cycle -- adds a boolean column is_cycle
using path -- adds a column that contains all parents for the id
select *
from tree
where not is_cycle;
Output:
id | parent_id | is_cycle | path
----+-----------+----------+-------------------
1 | | f | {(1)}
2 | 1 | f | {(1),(2)}
3 | 1 | f | {(1),(3)}
4 | 3 | f | {(1),(3),(4)}
5 | 4 | f | {(1),(3),(4),(5)}
(5 rows)
To detect if a cycle is present, we can just replace:
where not is_cycle
with:
where is_cycle
which outputs:
id | parent_id | is_cycle | path
----+-----------+----------+-----------------------
3 | 5 | t | {(1),(3),(4),(5),(3)}
(1 row)
As mentioned in the documentation, this syntax is a shortcut equivalent to adding an is_cycle and path array rows manually.