3

Edited to clear up the confusion in the problem statement; thanks for the answers!

My original problem was that I have a list [1,2,3,4,5,6,7,8], and I want to select every chunk of size x, with a gap of one element between chunks. So if I want to select chunks of size 2, the outcome would be [1,2,4,5,7,8]. A chunk size of three would give me [1,2,3,5,6,7].

I've searched a lot on slicing and couldn't find a way to select chunks instead of single elements. Making multiple slice operations, then joining and sorting the results, seems a little too expensive. The input can be either a Python list or a NumPy ndarray. Thanks in advance.

4
  • It's easy to do with array reshaping, but may not be faster if you start with a list. Arrays have a non-trivial creation cost. What size of lists are you working with? Commented Apr 24, 2016 at 18:46
  • Are you sure your example is right? I would expect every other chunk of size 2 be [1,2,5,6]. eg. [1,2,3,4,5,6,7,8] -> [[1,2], [3,4], [5,6], [7,8]] -> [[1,2], [5,6]] -> [1, 2, 5, 6]. Commented Apr 24, 2016 at 19:09
  • There's an ambiguity in the description. I first read it as pick (1,2), skip 3, pick (4,5), etc. Or it could be pick (1,2), skip (2,3), pick (2,4), skip (3,4) etc. Or as you have it, pick (1,2), skip (3,4), pick (5,6) etc. Commented Apr 24, 2016 at 19:14
  • It's actually a channel of an image, so let's say it's 1920 * 1080 and I'm slicing on its columns. I have edited the problem to clear up the confusion. Dunes's interpretation is also interesting to think about, though. Commented Apr 25, 2016 at 19:30

6 Answers 6

1

It seems to me that you want to skip one element between chunks until the end of the input list or array.

Here's one approach based on np.delete that deletes the single elements squeezed between chunks (note the integer division, which Python 3 needs so that the indices stay integers) -

out = np.delete(A,np.arange(len(A)//(x+1))*(x+1)+x)

Here's another approach based on boolean-indexing -

L = len(A)
avoid_idx = np.arange(L//(x+1))*(x+1)+x
out = np.array(A)[~np.in1d(np.arange(L),avoid_idx)]

Sample run -

In [98]: A = [51,42,13,34,25,68,667,18,55,32] # Input list

In [99]: x = 2

# Thus, [51,42,13,34,25,68,667,18,55,32]
                ^        ^         ^        # Skip these

In [100]: np.delete(A,np.arange(len(A)//(x+1))*(x+1)+x)
Out[100]: array([ 51,  42,  34,  25, 667,  18,  32])

In [101]: L = len(A)
     ...: avoid_idx = np.arange(L//(x+1))*(x+1)+x
     ...: out = np.array(A)[~np.in1d(np.arange(L),avoid_idx)]
     ...: 

In [102]: out
Out[102]: array([ 51,  42,  34,  25, 667,  18,  32])
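
The index arithmetic in both approaches can also be written as a modulo test on positions. Here is a minimal sketch, assuming Python 3; the helper name keep_chunks is hypothetical and not part of the answer above:

```python
import numpy as np

def keep_chunks(values, x):
    """Keep runs of x elements, dropping the single element after each run."""
    arr = np.asarray(values)
    # Position i is dropped when it is the last slot of its group of x+1.
    mask = np.arange(arr.size) % (x + 1) != x
    return arr[mask]

print(keep_chunks([51, 42, 13, 34, 25, 68, 667, 18, 55, 32], 2))
```

This avoids building the list of indices to delete, at the cost of allocating one boolean mask as long as the input.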

1 Comment

How about clever and fast pure list version?
1

First, create an array of indices, then use the np.in1d() function to mark the indices that should be omitted. A simple not operator (~) then gives the indices that must be preserved. Finally, pick those elements with boolean indexing:

>>> a = np.array([1,2,3,4,5,6,7,8])
>>> range_arr = np.arange(a.size)
>>> 
>>> a[~np.in1d(range_arr,range_arr[2::3])]
array([1, 2, 4, 5, 7, 8])

General approach:

>>> range_arr = np.arange(np_array.size) 
>>> np_array[~np.in1d(range_arr,range_arr[chunk::chunk+1])]
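
Since the question's comments mention slicing the columns of an image channel, the same mask can be built once and applied along the column axis. This is a sketch under assumptions: the function name keep_column_chunks and the tiny stand-in array are illustrative, and np.isin is used in place of np.in1d because the NumPy documentation recommends it for new code.

```python
import numpy as np

def keep_column_chunks(image, chunk):
    """Keep runs of `chunk` columns, dropping one column after each run."""
    image = np.asarray(image)
    cols = np.arange(image.shape[1])
    # Same selection as the general approach, built for the column axis.
    keep = ~np.isin(cols, cols[chunk::chunk + 1])
    return image[:, keep]

channel = np.arange(16).reshape(2, 8)  # tiny stand-in for a 1080 x 1920 channel
print(keep_column_chunks(channel, 2))
```

The mask depends only on the number of columns, so it can be reused across channels of the same width.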

3 Comments

Works now! Not so NumPythonic now, is it? ;) Lovely idea.
Also, add a note maybe that this requires a to be a NumPy array.
@Divakar Thanks, yeah I think so. Actually this was what came to my mind at first glance, but since I gave myself too much coffee tonight I wrote it in a much more complicated way :-D.
1

Using a pure python solution:

This assumes the desired items are: [yes, yes, no, yes, yes, no, ...]

Quicker to code, slower to run:

data = [1, 2, 3, 4, 5, 6, 7, 8]
filtered = [item for i, item in enumerate(data) if i % 3 != 2]
assert filtered == [1, 2, 4, 5, 7, 8]

Slightly slower to write, but faster to run:

from itertools import cycle, compress

data = [1, 2, 3, 4, 5, 6, 7, 8]
selection_criteria = [True, True, False]
filtered = list(compress(data, cycle(selection_criteria)))
assert filtered == [1, 2, 4, 5, 7, 8]

The second example runs in 66% of the time the first one takes. It is also clearer, and its selection criteria are easier to change.
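
To show how the selection criteria generalise, here is a small sketch that builds the pattern from a chunk size and a gap. The function name select_chunks and its gap parameter are assumptions for illustration, not part of the answer above.

```python
from itertools import compress, cycle

def select_chunks(data, size, gap=1):
    """Keep `size` items, skip `gap` items, and repeat to the end of `data`."""
    pattern = [True] * size + [False] * gap
    # cycle repeats the pattern indefinitely; compress stops when data ends.
    return list(compress(data, cycle(pattern)))

print(select_chunks([1, 2, 3, 4, 5, 6, 7, 8], 2))  # [1, 2, 4, 5, 7, 8]
print(select_chunks([1, 2, 3, 4, 5, 6, 7, 8], 3))  # [1, 2, 3, 5, 6, 7]
```

With size=2 and gap=2 the same function gives [1, 2, 5, 6], which is the "every other chunk" reading raised in the comments on the question.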


1

A simple list solution

>>> import itertools
>>> ll = [1,2,3,4,5,6,7,8]
>>> list(itertools.chain(*zip(ll[::3],ll[1::3])))
[1, 2, 4, 5, 7, 8]

This works at least for this case: chunks of size 2, skipping one value between chunks. The number of ll[...] slices determines the chunk size, and the slicing step determines the chunk spacing.

As I commented there is some ambiguity in the problem description, so I hesitate to generalize this solution more until that is cleared up.

It may be easier to generalize the numpy solutions, but they aren't necessarily faster. Conversion to arrays has a time overhead.

list(itertools.chain(*zip(*[ll[i::6] for i in range(3)])))

produces chunks of length 3, skipping 3 elements.

zip(*) is an idiomatic way of 'transposing' a list of lists

itertools.chain(*...) is an idiomatic way of flattening a list of lists.

Another option is a list comprehension with a condition based on item count

[v for i,v in enumerate(ll) if (i+1)%3]

handily skips every 3rd item, same as your example. (i%6 < 3) keeps 3, skips 3.
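
One caveat of the slicing version is that zip stops at the shortest slice, so a partial chunk at the end of the list is dropped. The sketch below is one way to generalise it while keeping that tail; the name slice_chunks, the gap parameter, and the sentinel are all illustrative assumptions.

```python
from itertools import chain, zip_longest

_MISSING = object()  # sentinel marking padding added by zip_longest

def slice_chunks(ll, size, gap=1):
    """Keep `size` items, skip `gap` items, using strided slices."""
    step = size + gap
    # One strided slice per position inside a chunk; zipping them
    # transposes the slices back into chunks.
    columns = [ll[i::step] for i in range(size)]
    flat = chain.from_iterable(zip_longest(*columns, fillvalue=_MISSING))
    # Drop the padding so a trailing partial chunk is kept rather than lost.
    return [v for v in flat if v is not _MISSING]

print(slice_chunks([1, 2, 3, 4, 5, 6, 7, 8], 2))     # [1, 2, 4, 5, 7, 8]
print(slice_chunks([1, 2, 3, 4, 5, 6, 7, 8], 3, 3))  # [1, 2, 3, 7, 8]
```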

3 Comments

Would say itertools.chain.from_iterable(...) is the more idiomatic way. A combination of compress and cycle from itertools is actually pretty fast, and very easy to generalise.
I'm stuck back in 2.5 before from_iterable. :) Or rather I must have learned of chain(* before they added that constructor.
I love argument unpacking, but I always feel it's a bit wasteful when you're unpacking into a varargs argument... I got frustrated enough to look for an alternative to chain(* once and got lucky. Wish there was one for zip too.
0

This should do the trick:

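import numpy as np

data = np.asarray(data)   # data: the input list or array
step = 3                  # chunk size plus the one-element gap
size = 2                  # elements kept per chunk
chunks = len(data) // step
# Full groups: reshape to (chunks, step), keep the first `size` columns, flatten.
full = data[:chunks*step].reshape(chunks, step)[:, :size].ravel()
tail = data[chunks*step:][:size]   # leftover partial group, if any
result = np.concatenate([full, tail])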


0

A simple list comprehension can do the job:

[ L[i] for i in range(len(L)) if i%3 != 2 ]

For chunks of size n:

[ L[i] for i in range(len(L)) if i%(n+1) != n ]
