22

I have a table in PostgreSQL 9.2 that looks like this (simplified):

CREATE TABLE my_features
(
  id integer NOT NULL,
  feature_id integer NOT NULL,
  begin_time timestamp NOT NULL,
  end_time timestamp
)

For each feature_id there may be multiple rows with time ranges specified by begin_time/end_time. They may overlap, but this is relatively rare. I'm looking for a fast way to find all feature_ids that have/don't have any overlaps.

I tried to do this using window functions, like this:

SELECT feature_id, bool_or(end_time > lead(begin_time) OVER ts_win) OVER ts_win AS overlaps_any
FROM my_features
WINDOW ts_win AS (PARTITION BY feature_id ORDER BY begin_time)

... but this doesn't work:

ERROR:  window function calls cannot be nested

The algorithm is simple: order the rows for a given feature_id by begin_time and check if any end_time > the next begin_time (if any). I suspect there must be an easy way to do this, perhaps with tsrange functions, but can't seem to find it just now.

2 Answers 2

41

This can indeed be done using range types.

The following selects all those rows that do have overlapping ranges:

select f1.*
from my_features f1
where exists (select 1
              from my_features f2
              where tsrange(f2.begin_time, f2.end_time, '[]') && tsrange(f1.begin_time, f1.end_time, '[]')
                and f2.feature_id = f1.feature_id
                and f2.id <> f1.id);

When you change the condition to NOT EXISTS you'll find those that don't have any overlapping ranges.

SQLFiddle example: http://sqlfiddle.com/#!15/40b1e/1

tsrange(f2.begin_time, f2.end_time, '[]') creates a range that includes the upper and lower bounds. You can also create ranges that exclude either one or both.

More details can be found in the manual:
http://www.postgresql.org/docs/current/static/rangetypes.html#RANGETYPES-INCLUSIVITY

The && operator checks if the two ranges overlap: http://www.postgresql.org/docs/current/static/functions-range.html

(I just wish Oracle had something fancy like that...)

Sign up to request clarification or add additional context in comments.

3 Comments

Thanks, that works! Looks simple now that I see it. :)
Awesome ! thanks a lot it save me days, i'm just discovering postgre and that kind of cool stuff !
Thank you so much, this saved me today.
4

Here is an observation. If there are overlapping time periods for a feature, then at least one time period overlaps with the preceding one as defined by begin_time. (You can look at this the other way. If there are no such overlaps then there is always a gap between one time frame and the next and nothing overlaps.)

This leads to the following query for determining overlaps:

select f.feature_id
from (select f.feature_id,
             (case when lag(end_time) over (partition by feature_id order by begin_time) > begin_time
                   then 1 else 0
              end) as HasOverlap
      from my_features f
     ) f
group by f.feature_id
having max(HaxOverlap) = 1;

1 Comment

Thanks, this works and it's basically what I was trying to do in my original post. I removed the "case when" and just used bool_or() on the result of the comparison.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.