Is there a datastructure that will maintain a unique set of ranges, merging an contiguous or overlapping ranges that are added? I need to track which ranges have been processed, but this may occur in an arbitrary order. E.g.:
range_set = RangeSet() # doesn't exist that I know of, this is what I need help with
def process_data(start, end):
global range_set
range_set.add_range(start, end)
# ...
process_data(0, 10)
process_data(20, 30)
process_data(5, 15)
process_data(50, 60)
print(range_set.missing_ranges())
# [[16,19], [31, 49]]
print(range_set.ranges())
# [[0,15], [20,30], [50, 60]]
Notice that overlapping or contiguous ranges get merged together. What is the best way to do this? I looked at using the bisect module, but its use didn't seem terribly clear.
Interval Treecan help you with this - it allows range-based queries and insertion. There are multiple packages as far as I know, e.g. pypi.python.org/pypi/intervaltree/2.0.4Interval Treeis too complex, writing a class probably wouldn't be too difficult.SortedSetwith a specialOverlappingIntervalsUpdatorfor some algorithms e.g., Python - Removing overlapping lists. Somewhat related: to merge discrete sets, you could use "disjoint set" (aka UnionFind structure) or connected components (in a graph).