10

Is there a pythonic way to select N consecutive elements from a list or numpy array.

So Suppose:

Choice = [1,2,3,4,5,6] 

I would like to create a new list of length N by randomly selecting element X in Choice along with the N-1 consecutive elements following choice.

So if:

X = 4 
N = 4

The resulting list would be:

Selection = [5,6,1,2] 

I think something similar to the following would work.

S = [] 
for i in range(X,X+N):
    S.append(Selection[i%6])    

But I was wondering if there is a python or numpy function that can select the elements at once that was more efficient.

8
  • 2
    why not just randomly choose a starting index from [0, len(choices) - N)? Commented Jan 27, 2021 at 2:23
  • 1
    Check out the slice notation: my_list[1:3]. You will need to figure out the logic if you consider the list circular. Commented Jan 27, 2021 at 2:23
  • 2
    Does this answer your question? Cycle through list starting at a certain element Commented Jan 27, 2021 at 2:26
  • 1
    Can N be bigger than len(Choice)? Commented Jan 27, 2021 at 6:38
  • 1
    @phntm yes, if N <= len(Choice) the answer can be much simpler. See the edit to my answer Commented Jan 27, 2021 at 8:23

5 Answers 5

10

Use itertools, specifically islice and cycle.

start = random.randint(0, len(Choice) - 1)
list(islice(cycle(Choice), start, start + n))

cycle(Choice) is an infinite sequence that repeats your original list, so that the slice start:start + n will wrap if necessary.

Sign up to request clarification or add additional context in comments.

2 Comments

This is awesome thanks! Is using itertools generally faster than numpy?
@phntm: For large inputs, it's almost guaranteed to be slower than any other solution, as it's O(n) on both memory usage and processing time (islice can't skip values, cycle stores them all until the input is exhausted). For small inputs, it hardly matters what solution you use.
4

You could use a list comprehension, using modulo operations on the index to keep it in range of the list:

Choice = [1,2,3,4,5,6] 
X = 4 
N = 4
L = len(Choice)
Selection = [Choice[i % L] for i in range(X, X+N)]
print(Selection)

Output

[5, 6, 1, 2]

Note that if N is less than or equal to len(Choice), you can greatly simplify the code:

Choice = [1,2,3,4,5,6] 
X = 4 
N = 4
L = len(Choice)
Selection = Choice[X:X+N] if X+N <= L else Choice[X:] + Choice[:X+N-L]
print(Selection)

Comments

3

Since you are asking for the most efficient way I created a little benchmark to test the solutions proposed in this thread.

I rewrote your current solution as:

def op(choice, x):
    n = len(choice)
    selection = []
    for i in range(x, x + n):
        selection.append(choice[i % n])
    return selection

Where choice is the input list and x is the random index.

These are the results if choice contains 1_000_000 random numbers:

chepner: 0.10840400000000017 s
nick: 0.2066781999999998 s
op: 0.25887470000000024 s
fountainhead: 0.3679908000000003 s

Full code

import random
from itertools import cycle, islice
from time import perf_counter as pc
import numpy as np


def op(choice, x):
    n = len(choice)
    selection = []
    for i in range(x, x + n):
        selection.append(choice[i % n])
    return selection


def nick(choice, x):
    n = len(choice)
    return [choice[i % n] for i in range(x, x + n)]


def fountainhead(choice, x):
    n = len(choice)
    return np.take(choice, range(x, x + n), mode='wrap')


def chepner(choice, x):
    n = len(choice)
    return list(islice(cycle(choice), x, x + n))


results = []
n = 1_000_000
choice = random.sample(range(n), n)
x = random.randint(0, n - 1)

# Correctness
assert op(choice, x) == nick(choice,x) == chepner(choice,x) == list(fountainhead(choice,x))

# Benchmark
for f in op, nick, chepner, fountainhead:
    t0 = pc()
    f(choice, x)
    t1 = pc()
    results.append((t1 - t0, f))

for t, f in sorted(results):
    print(f'{f.__name__}: {t} s')

3 Comments

Thanks for taking the time - it's always interesting to see performance results.
Interestingly on my computer, for 1,000,000 entries the advantage of chepner is only about 15-20%, and it decreases further as the list length gets longer. At shorter lengths (10k or less), chepner is more than 2x faster.
@Nick yes chepner solution seems the fastest for a list. For larger list size (>10^6) using an array as input (like array('i', choice)) is faster and takes less memory. If choice is a Numpy array then this solution is the fastest regardless of the input size.
3

If using a numpy array as the source, we could of course use numpy "fancy indexing".

So, if ChoiceArray is the numpy array equivalent of the list Choice, and if L is len(Choice) or len(ChoiceArray):

Selection = ChoiceArray [np.arange(X, N+X) % L]

Comments

2

Here's a numpy approach:

import numpy as np

Selection = np.take(Choice, range(X,N+X), mode='wrap')

Works even if Choice is a Python list rather than a numpy array.

2 Comments

Thanks this is really helpful! Is there a difference in terms of using np.random.default_rng vs np.random.choice?
@phntm - Yes there are differences. I think this SO thread discusses that

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.