7
my_list = [my_list[int((i**2 + i)/2):int((i**2 + 3*i + 3)/2)] for i in range(int((-1 + (1 + 8*len(my_list))**0.5)/2))]

Is there a neater solution to grouping the elements of a list into subgroups of increasing size than this?

Examples:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] --> [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
[1, 2, 3, 4] --> [[1], [2, 3]]
[1, 2, 3, 4, 5, 6] --> [[1], [2, 3], [4, 5, 6]]

EDIT

Here are the results from timeit:

from timeit import Timer
from itertools import count

def martijn(it):
    it = iter(it)
    return list([next(it) for _ in range(s)] for s in count(1))

def mathematical(it):
    upper_bound = int(((1 + 8*len(it))**0.5 + 1)//2)
    return [it[i*(i-1)//2:i*(i+1)//2] for i in range(1, upper_bound)]

def time(test, n):
    a = Timer(lambda: martijn(test)).timeit(n)
    b = Timer(lambda: mathematical(test)).timeit(n)
    return round(a, 3), round(b, 3)

>>> for i in range(8):
        loops = 10**max(0, (6-i))
        print(time([n for n in range(10**i)], loops), loops)
(6.753, 4.416) 1000000
(1.166, 0.629) 100000
(0.366, 0.123) 10000
(0.217, 0.036) 1000
(0.164, 0.017) 100
(0.157, 0.017) 10
(0.167, 0.021) 1
(1.749, 0.251) 1
>>> for i in range(8):
        loops = 10**max(0, (6-i))
        print(time(range(10**i), loops), loops)
(6.721, 4.779) 1000000
(1.184, 0.796) 100000
(0.367, 0.173) 10000
(0.218, 0.051) 1000
(0.202, 0.015) 100
(0.178, 0.005) 10
(0.207, 0.002) 1
(1.872, 0.005) 1
9
  • 7
    Good Lord. For the sake of Python's philosophy, I sincerely hope so. Commented Apr 11, 2014 at 14:29
  • 2
    What's the pattern you're trying to achieve here? Should each sublist be 1 element longer than the previous sublist? Commented Apr 11, 2014 at 14:29
  • 1
    @2rs2ts : Yes. I will add more examples. Commented Apr 11, 2014 at 14:30
  • There's certainly a more readable way, that's for sure. Commented Apr 11, 2014 at 14:32
  • 2
    @thefourtheye Why would the 4 want to stay? The groups need to be in increasing size in steps of 1. The 4 will feel outnumbered. Commented Apr 11, 2014 at 14:37

6 Answers 6

13

Using a generator expression:

from itertools import count

try:
    _range = xrange
except NameError:
    # Python 3
    _range = range


def incremental_window(it):
    """Produce monotonically increasing windows on an iterable.

    Only complete windows are yielded, if the last elements do not form
    a complete window they are ignored.

    incremental_window('ABCDEF') -> ['A'], ['B', 'C'], ['D', 'E', 'F']
    incremental_window('ABCDE') -> ['A'], ['B', 'C']

    """
    it = iter(it)
    return ([next(it) for _ in _range(s)] for s in count(1))

Demo:

>>> list(incremental_window([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]))
[[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
>>> list(incremental_window([1, 2, 3, 4]))
[[1], [2, 3]]
>>> list(incremental_window([1, 2, 3, 4, 5, 6]))
[[1], [2, 3], [4, 5, 6]]

This is a generator that'll work with any iterable, including endless iterables:

>>> from itertools import count
>>> for window in incremental_window(count()):
...     print window
...     if 25 in window:
...         break
... 
[0]
[1, 2]
[3, 4, 5]
[6, 7, 8, 9]
[10, 11, 12, 13, 14]
[15, 16, 17, 18, 19, 20]
[21, 22, 23, 24, 25, 26, 27]

You could make that a one-liner with a little cheating to 'inline' the iter() call on your list object:

list([next(it) for _ in _range(s)] for it in (iter(my_list),) for s in count(1))
Sign up to request clarification or add additional context in comments.

5 Comments

+1 for suggesting writing a function on StackOverflow. You don't see that often enough.
is this because StopIteration errors result in the [next(it) for _ in _range(s)] expression not being returned?
@njzk2: The generator expression exits when next(it) raises StopIteration, yes. The exception is propagated, actually, it is list() that catches it.
@MartijnPieters: Thanks, I learnt something today! using list on a generator and generating a list have different behavior in this matter !
This is definitely the neatest answer. Accepted. I have put the timing of the two main types of solutions in the question. Although this one is neater, the mathematical one is faster.
1

I'm not honestly totally clear why you want to do this, which I mention purely because there's likely a task-specific way to answer your question, but I would argue that the following is at least clearer:

def increasing_groups(l):
    current_size = 1
    while l:
        yield l[:current_size]
        l = l[current_size:]
        current_size += 1

at which point you can get it via list(increasing_groups(some_list)).

Comments

1

You can keep track of the number of items to slice with itertools.count and you can pick the items with itertools.islice.

# Initializations and declarations
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
from itertools import count, islice
counter, it = count(0), iter(data)

# Actual list construction
result = [[item] + list(islice(it, next(counter))) for item in it]

# Making sure that the last item of the list is consistent with the previous item
if len(result) > 1 and len(result[-1]) <= len(result[-2]): del result[-1]

print(result)
# [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]

The important thing is

if len(result) > 1 and len(result[-1]) <= len(result[-2]): del result[-1]

this line makes sure that, the last item in the list stays only if its length is greater than the last but one.

Comments

1
def incr_grouped(iterable):
    it, n = iter(iterable), 1
    while True:
        yield [next(it) for _ in range(n)]
        n += 1

The key here is that StopIteration exception of next(it) breaks the while loop as well. This means that you may loose the last elems which are not fitted in a group.

>>> list(incr_grouped('ABCDEF'))
[['A'], ['B', 'C'], ['D', 'E', 'F']]
>>> list(incr_grouped([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]))
[[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]

It can be made even more compact using itertools. Check Martijn Pieters' answer.

Comments

1

Yes, there is simple answer.

>>> test = [1, 2, 3, 4, 5, 6, 7]
>>> bound = int((-1 + (1 + 8 * len(test)) ** 0.5) / 2)
>>> res = [test[(i + 1) * i // 2 : (i + 1) * (i + 2) // 2] for i in xrange(bound)]
>>> res
[[1], [2, 3], [4, 5, 6]]

Because the size of each slice is an arithmetic sequence. And the equation to compute the total number of arithmetic sequence is known. So we could simply compute the begin and end index of each slice directly with that equation.

8 Comments

Yes, the solution is int((-1 + (1 + 8*len(my_list))**0.5)/2)
This produces overlapping windows: [[1], [2, 3], [3, 4, 5], [4, 5, 6, 7], [5, 6, 7, 8, 9]]
Have you tested this? It doesn't work because the start of your slice is simply i. The slicing will look like (0, 1) (1, 3) (2, 5) instead of (0, 1) (1, 3) (3, 6), which is incorrect.
@Scorpion_God Updated the answer. Rush for ten minutes for a PC~
@MartijnPieters I've updated the answer. I was with an iPad, and unable to test it.
|
1

This

(n * (n - 1) / 2, n * (n + 1) / 2)

Gives you, according to Gauss, the start and end indices of the nth element of your new list.

Therefore

my_list[n * (n - 1) / 2 : n * (n + 1) / 2]

Is the nth element of the list, and with a bit blunt filtering:

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[my_list[n * (n - 1) / 2: n * (n + 1)/ 2] for n in range(1, len(my_list)) if n * (n + 1)/ 2 <= len(my_list)]
# [[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]

A proper loop with an actual break would probably be better, though

Edit

Now that I know about how StopIteration is caught by list (Thank you Martjin), a simple closing condition can be done using:

list(my_list[n * (n - 1) // 2: n * (n + 1) // 2] for n in count(1) if iter(my_list[n * (n + 1)/ 2:]).next() > -1)

Provided -1 is lower than any item in your list. (And the floor divisions are for integer typing in python 3.)

11 Comments

Can't the if limit be folded into the range()? E.g. calculate the required upper limit to n up front.
The upper limit is int((-1 + (1 + 8*len(my_list))**0.5)/2), which is the positive solution to the quadratic equation derived from s = n*(n+1)/2
Then use that + 1 instead of the upper bound of len(my_list) in the range() call, and the if filter can go.
[my_list[n * (n - 1) // 2: n * (n + 1) // 2] for n in range(1, int((-1 + (1 + 8 * len(my_list)) ** 0.5) / 2) + 1)] (with some floor division thrown in)
the floor division is not useful, as (n * (n - 1)) is pair by definition.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.