1

If I have a 1D array of bools, say np.array([True, False, True, True True, False, True, True]), I'd like a function that could return to me the indices of all contiguous regions of True.

For example, the output of this function called on the above array would produce something like [(0,0), (2,4), (6,7)].

I'm not sure how to accomplish this nicely, also I would like to be able to do this with PyTorch tensors as well.

4
  • Do you want only True, or False also? Commented Jul 25, 2021 at 1:19
  • 1
    It seems like it would be much more in keeping with the way Python works in other places to have half open intervals like [(0, 1), (2, 5), (6, 8)] which could then be used directly as ranges of slices. Commented Jul 25, 2021 at 1:35
  • @raunasur Probably don't need False just because I could easily invert the array and do the same calculation. Commented Jul 25, 2021 at 1:39
  • @Mark Good call, that might make it more user friendly. The solution I created doesn't do this but wouldn't be too hard to incorporate. Commented Jul 25, 2021 at 1:40

7 Answers 7

4

I've implemented this exact thing in a utility library I wrote called haggis. The function is haggis.npy_util.mask2runs. I wrote it in part to deal with this recurring question on Stack Overflow:

runs = haggis.math.mask2runs(mask)

The second column is exclusive indices, since that's more useful in both python and numpy, so you might want to do

runs[:, -1] -= 1

There's nothing special about this function. You can write it as a one-liner using numpy:

runs = numpy.flatnonzero(np.diff(numpy.r_[numpy.int8(0), mask.view(numpy.int8), numpy.int8(0)])).reshape(-1, 2)
Sign up to request clarification or add additional context in comments.

Comments

1

Here is a near one line, numpy only implementation.

Works for the input. I have not tested for edge cases but should work as well

Cheers if you need this for Advent of Code :)

import numpy as np

a = np.array([True, False, True, True, True, False, True, True])

#Indices are found here
a_true_ranges = np.argwhere(np.diff(a,prepend=False,append=False))

#Conversion into list of 2-tuples
a_true_ranges = a_true_ranges.reshape(len(a_true_ranges)//2,2)
a_true_ranges = [tuple(r) for r in a_true_ranges]
a_true_ranges
#[(0, 1), (2, 5), (6, 8)]

Comments

0

Here's one solution I found:

def find_contigs(mask):
    """
    Find contiguous regions in a mask that are True with no False in between
    """
    assert len(mask.shape) == 1 # 1D tensor of bools 
    
    contigs = []
    found_contig = False 
    for i,b in enumerate(mask):
        
        
        if b and not found_contig:   # found the beginning of a contig
            contig = [i]
            found_contig = True 
        
        elif b and found_contig:     # currently have contig, continuing it 
            pass 
        
        elif not b and found_contig: # found the end, record previous index as end, reset indicator  
            contig.append(i-1)
            found_contig = False 
            contigs.append(tuple(contig))
        
        else:                        # currently don't have a contig, and didn't find one 
            pass 
    
    
    # fence post bug - check if the very last entry was True and we didn't get to finish 
    if b:
        contig.append(i)
        found_contig = False 
        contigs.append(tuple(contig))
        
    return contigs

1 Comment

Where did you find it?
0

I'm not sure if there's a good purely Numpy approach to this, but since it looks like you're willing to loop, you could just use itertools.groupby and keep track of the index and group length:

from itertools import groupby

a = np.array([True, False, True, True, True, False, True, True])

i = 0
res = []

for k, g in groupby(a):
    l = len(list(g))
    if k:
        res.append((i,i+l))
    i += l

print(res)
# [(0, 1), (2, 5), (6, 8)]

If you want the closed intervals, obviously you can just subtract 1 from the second tuple value.

You could also write it as a generator which can be a little friendlier on the memory with long lists:

from itertools import groupby

a = np.array([True, False, True, True, True, False, True, True])

def true_indices(a):
    i = 0
    for k, g in groupby(a):
        l = len(list(g))
        if k:
            yield (i,i+l)
        i += l

list(true_indices(a))
# [(0, 1), (2, 5), (6, 8)]

Comments

0

I'm not too familiar with Pytorch but the problem itself looks easily solvable. You're going to want to loop over the whole list and store the first index of a contingency region where one is found. I've used -1 since it's out of bounds of any index. From there you just append a tuple to a list with the saved index and current index and reset the start index to -1. Since this relies on a False boolean being found, I've added a final check outside the loop for a non-reset start index which implies a contingency region from there to the end of the list.

I said current index but to keep it inbound I subtracted 1 to make it more in line with what you wanted.

def getConRegions(booleanList):
    conRegions=[]
    start=-1
    for index,item in enumerate(booleanList):
        if item and start<0:start=index
        elif start>=0 and not item:
            conRegions.append((start,index-1))
            start=-1
    if(start>=0):conRegions.append((start,len(booleanList)-1))
    return conRegions
print(getConRegions([True,False,True,True,True,False,True,True]))

Comments

0

My humble solution:

list = [True, False, True, True, True, False, True, True]
if list[-1] == True:
    list.extend([False])

answers = []
beginning = "yes"

for index, value in enumerate(list):
    if value == True and beginning == "yes":
        x = index
        beginning = "not anymore"
    elif value == True:
        pass
    else:
        answers.append((x, index -1))
        beginning = "yes"
        
print(answers)

Output:

[(0, 0), (2, 4), (6, 7)]

Comments

0

Fast, numpy-based solution:

import numpy as np

def find_contigs(arr):
    indices = []
    # Each number in this array will be the index of an
    # element in 'arr' just *before* it switches
    nonzeros = np.nonzero(np.diff(arr))[0]

    # Case if array begins with True
    if arr[0]:
        if len(nonzeros):
            indices.append((0, nonzeros[0] + 1))
        else:
            indices.append((0, len(arr)))
        nonzeros = nonzeros[1:]

    # Correct for array ending in True
    if len(nonzeros) % 2:
        final_idx = (nonzeros[-1] + 1, len(arr))
        nonzeros = nonzeros[:-1]
    else:
        final_idx = None

    # Parse nonzero indices
    nonzeros += 1
    nonzeros = nonzeros.reshape((len(nonzeros)//2, 2))
    indices.extend([(a, b) for a, b in zip(nonzeros[:, 0], nonzeros[:, 1])])
    if final_idx is not None:
        indices.append(final_idx)

    return indices

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.