Finding contiguous regions in a 1D boolean array

Question

If I have a 1D array of bools, say np.array([True, False, True, True True, False, True, True]), I'd like a function that could return to me the indices of all contiguous regions of True.

For example, the output of this function called on the above array would produce something like [(0,0), (2,4), (6,7)].

I'm not sure how to accomplish this nicely, also I would like to be able to do this with PyTorch tensors as well.

It seems like it would be much more in keeping with the way Python works in other places to have half open intervals like [(0, 1), (2, 5), (6, 8)] which could then be used directly as ranges of slices. — Mark
– Mark, Commented Jul 25, 2021 at 1:35
@raunasur Probably don't need False just because I could easily invert the array and do the same calculation. — David Juergens
– David Juergens, Commented Jul 25, 2021 at 1:39
@Mark Good call, that might make it more user friendly. The solution I created doesn't do this but wouldn't be too hard to incorporate. — David Juergens
– David Juergens, Commented Jul 25, 2021 at 1:40

Mad Physicist · Accepted Answer · 2022-08-12 15:16:14Z

4

I've implemented this exact thing in a utility library I wrote called haggis. The function is haggis.npy_util.mask2runs. I wrote it in part to deal with this recurring question on Stack Overflow:

runs = haggis.math.mask2runs(mask)

The second column is exclusive indices, since that's more useful in both python and numpy, so you might want to do

runs[:, -1] -= 1

There's nothing special about this function. You can write it as a one-liner using numpy:

runs = numpy.flatnonzero(np.diff(numpy.r_[numpy.int8(0), mask.view(numpy.int8), numpy.int8(0)])).reshape(-1, 2)

edited Aug 12, 2022 at 15:16

answered Jul 25, 2021 at 2:08

Mad Physicist

116k29 gold badges202 silver badges292 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

UniCycle958 · Accepted Answer · 2023-12-03 10:49:01Z

1

Here is a near one line, numpy only implementation.

Works for the input. I have not tested for edge cases but should work as well

Cheers if you need this for Advent of Code :)

import numpy as np

a = np.array([True, False, True, True, True, False, True, True])

#Indices are found here
a_true_ranges = np.argwhere(np.diff(a,prepend=False,append=False))

#Conversion into list of 2-tuples
a_true_ranges = a_true_ranges.reshape(len(a_true_ranges)//2,2)
a_true_ranges = [tuple(r) for r in a_true_ranges]
a_true_ranges
#[(0, 1), (2, 5), (6, 8)]

answered Dec 3, 2023 at 10:49

UniCycle958

113 bronze badges

Comments

David Juergens · Accepted Answer · 2021-07-25 01:37:44Z

0

Here's one solution I found:

def find_contigs(mask):
    """
    Find contiguous regions in a mask that are True with no False in between
    """
    assert len(mask.shape) == 1 # 1D tensor of bools 
    
    contigs = []
    found_contig = False 
    for i,b in enumerate(mask):
        
        
        if b and not found_contig:   # found the beginning of a contig
            contig = [i]
            found_contig = True 
        
        elif b and found_contig:     # currently have contig, continuing it 
            pass 
        
        elif not b and found_contig: # found the end, record previous index as end, reset indicator  
            contig.append(i-1)
            found_contig = False 
            contigs.append(tuple(contig))
        
        else:                        # currently don't have a contig, and didn't find one 
            pass 
    
    
    # fence post bug - check if the very last entry was True and we didn't get to finish 
    if b:
        contig.append(i)
        found_contig = False 
        contigs.append(tuple(contig))
        
    return contigs

answered Jul 25, 2021 at 1:37

David Juergens

676 bronze badges

1 Comment

Mad Physicist Over a year ago

Where did you find it?

Mark · Accepted Answer · 2021-07-25 01:51:57Z

I'm not sure if there's a good purely Numpy approach to this, but since it looks like you're willing to loop, you could just use itertools.groupby and keep track of the index and group length:

from itertools import groupby

a = np.array([True, False, True, True, True, False, True, True])

i = 0
res = []

for k, g in groupby(a):
    l = len(list(g))
    if k:
        res.append((i,i+l))
    i += l

print(res)
# [(0, 1), (2, 5), (6, 8)]

If you want the closed intervals, obviously you can just subtract 1 from the second tuple value.

You could also write it as a generator which can be a little friendlier on the memory with long lists:

from itertools import groupby

a = np.array([True, False, True, True, True, False, True, True])

def true_indices(a):
    i = 0
    for k, g in groupby(a):
        l = len(list(g))
        if k:
            yield (i,i+l)
        i += l

list(true_indices(a))
# [(0, 1), (2, 5), (6, 8)]

Dharman · Accepted Answer · 2021-07-25 02:28:53Z

I'm not too familiar with Pytorch but the problem itself looks easily solvable. You're going to want to loop over the whole list and store the first index of a contingency region where one is found. I've used -1 since it's out of bounds of any index. From there you just append a tuple to a list with the saved index and current index and reset the start index to -1. Since this relies on a False boolean being found, I've added a final check outside the loop for a non-reset start index which implies a contingency region from there to the end of the list.

I said current index but to keep it inbound I subtracted 1 to make it more in line with what you wanted.

def getConRegions(booleanList):
    conRegions=[]
    start=-1
    for index,item in enumerate(booleanList):
        if item and start<0:start=index
        elif start>=0 and not item:
            conRegions.append((start,index-1))
            start=-1
    if(start>=0):conRegions.append((start,len(booleanList)-1))
    return conRegions
print(getConRegions([True,False,True,True,True,False,True,True]))

thenarfer · Accepted Answer · 2021-07-25 03:28:54Z

0

My humble solution:

list = [True, False, True, True, True, False, True, True]
if list[-1] == True:
    list.extend([False])

answers = []
beginning = "yes"

for index, value in enumerate(list):
    if value == True and beginning == "yes":
        x = index
        beginning = "not anymore"
    elif value == True:
        pass
    else:
        answers.append((x, index -1))
        beginning = "yes"
        
print(answers)

Output:

[(0, 0), (2, 4), (6, 7)]

answered Jul 25, 2021 at 3:28

thenarfer

4553 silver badges14 bronze badges

Comments

Isarcus · Accepted Answer · 2022-09-24 16:02:34Z

Fast, numpy-based solution:

import numpy as np

def find_contigs(arr):
    indices = []
    # Each number in this array will be the index of an
    # element in 'arr' just *before* it switches
    nonzeros = np.nonzero(np.diff(arr))[0]

    # Case if array begins with True
    if arr[0]:
        if len(nonzeros):
            indices.append((0, nonzeros[0] + 1))
        else:
            indices.append((0, len(arr)))
        nonzeros = nonzeros[1:]

    # Correct for array ending in True
    if len(nonzeros) % 2:
        final_idx = (nonzeros[-1] + 1, len(arr))
        nonzeros = nonzeros[:-1]
    else:
        final_idx = None

    # Parse nonzero indices
    nonzeros += 1
    nonzeros = nonzeros.reshape((len(nonzeros)//2, 2))
    indices.extend([(a, b) for a, b in zip(nonzeros[:, 0], nonzeros[:, 1])])
    if final_idx is not None:
        indices.append(final_idx)

    return indices

Collectives™ on Stack Overflow

Finding contiguous regions in a 1D boolean array

7 Answers 7

Comments

Comments

1 Comment

Comments

Comments

Comments

Fast, numpy-based solution:

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

7 Answers 7

Comments

Comments

1 Comment

Comments

Comments

Comments

Fast, numpy-based solution:

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related