4
\$\begingroup\$

I try to solve the following problem using Python:
Given a 1-dimensional numeric array arr, find all indices where the value of arr changes. Return for each pair (a, b) (where a != b) an iterator of the indices i where arr[i] == a and arr[i+1] == b.

Example:
If the input array was [1,1,1,1,1,1,5,5,5,5,5,5,5,5,5,5,5,1,1,1,1,1,1,4,4,4,4,4,4,4,4,4,1,1,1,1,1,1], then the function should return the following pairs and iterators:

(1, 5): [6],
(5, 1): [17],
(1, 4): [23],
(4, 1): [32]

If the input is [1,1,1,1,4,4,4,4,1,1,4,4,4,4,4,4,3,3,3,1,4,4,4], then the function should return:

(1, 4): [4, 10, 20],
(4, 1): [8],
(4, 3): [16],
(3, 1): [19]

In my current implementation, for each pair a list is returned, but it may be any iterator. All lists are contained in a dictionary (a defaultdict, to be precise), but they may be returned in any style.

This is my current implementation:

from collections import defaultdict

def find_changepoints(arr):
    changepoints = defaultdict(list)
    current_value = arr[0]
    for (index, value) in enumerate(arr):
        if value != current_value:
            changepoints[ (current_value, value) ].append(index)
            current_value = value
    
    return changepoints
\$\endgroup\$

2 Answers 2

1
\$\begingroup\$

Minor feedback about your existing solution: add PEP484 type hints; enumerate(arr[1:]) instead of over the whole array; no need to enclose your tuple index in parens; add unit tests.

If this is homework, a programming challenge or whatever and exists in a vacuum, you've basically met the brief. If this is for a specific application and you can replace the dictionary representation with something more efficient, then that will benefit you. For example, the following implementation has a dictionary wrapper that mimics yours around a vectorised function. If you have an application that is able to use the vectorised function directly, it will be faster for large arrays.

from collections import defaultdict
from numbers import Real
from typing import Sequence

import numpy as np


def changed_where_numpy(in_seq: np.ndarray) -> tuple[
    np.ndarray,  # old values
    np.ndarray,  # new values
    np.ndarray,  # new indices
]:
    old_idx, = np.diff(in_seq).nonzero()
    new_idx = old_idx + 1
    return in_seq[old_idx], in_seq[new_idx], new_idx


def changed_where(in_seq: Sequence[Real]) -> dict[
    tuple[Real, Real],  # old and new value
    list[int],          # new value indices
]:
    old_vals, new_vals, new_idx = changed_where_numpy(np.array(in_seq))
    result = defaultdict(list)
    for old_val, new_val, new_idx in zip(old_vals, new_vals, new_idx):
        result[old_val, new_val].append(new_idx)
    return result


def test():
    in_seq = (
        1, 1, 1, 1, 1, 1,
        5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
        1, 1, 1, 1, 1, 1,
        4, 4, 4, 4, 4, 4, 4, 4, 4,
        1, 1, 1, 1, 1, 1,
    )
    expected = {
        (1, 5): [6],
        (5, 1): [17],
        (1, 4): [23],
        (4, 1): [32],
    }
    actual = changed_where(in_seq)
    assert expected == actual

    in_seq = (
        1, 1, 1, 1,
        4, 4, 4, 4,
        1, 1,
        4, 4, 4, 4, 4, 4,
        3, 3, 3,
        1,
        4, 4, 4,
    )
    expected = {
        (1, 4): [4, 10, 20],
        (4, 1): [8],
        (4, 3): [16],
        (3, 1): [19]
    }
    actual = changed_where(in_seq)
    assert expected == actual


if __name__ == '__main__':
    test()
\$\endgroup\$
1
\$\begingroup\$

Return for each pair (a, b) (where a != b) an iterator of the indices i where arr[i] == a and arr[i+1] == b.

You failed the task, since your function is not returning an iterator of indices. Also, if you read your task carefully, your indices are off by 1.

from itertools import islice
from typing import Iterator


def changes(lst: list) -> Iterator[int]:
    for index, (current, next) in enumerate(zip(lst, islice(lst, 1, None))):
        if current != next:
            yield index
\$\endgroup\$
3
  • \$\begingroup\$ Thank you for pointing out my misconception. I mixed up the words iterable and iterator when I formulated the task. I will not edit my initial question as to not make your answer invalid. \$\endgroup\$ Commented Mar 22, 2022 at 8:20
  • \$\begingroup\$ I see. Given your question, I assumed that the task was given to you by a third party as cited by you. \$\endgroup\$ Commented Mar 22, 2022 at 13:09
  • \$\begingroup\$ This solution seems to return a single iterator of all changes in the whole input list, whilst the task is (if I get it correctly) to return a separate iterator for each distinct pair of values at the constant runs' boundaries. \$\endgroup\$ Commented Dec 12, 2023 at 7:55

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.