4

In Python, given a list like this:

a = [ 0, 1, 3, 4, 6, 7, 8, 10, 14 ]

I would like to split this into three uneven groups, such that I end up with something like this:

b = [0, 1, 3, 4]
c = [6, 7, 8]
d = [10, 14]

I want to group the numbers into ranges of 5: any integers from 0 to 4 would end up in the first list, 5 to 9 in the second, and so on.

  • How do you specify how long you want b and c to be? In other words, what properties do you want the unevenness to have? Commented Jun 1, 2013 at 2:12
  • Chunks of a maximum of 5, yes. Commented Jun 1, 2013 at 2:42

3 Answers

3

itertools.groupby is always the answer!

Here we round each number down to the nearest multiple of 5, and then group consecutive numbers that share the same result (the input must already be sorted, as it is here):

>>> import itertools
>>> for n, g in itertools.groupby(a, lambda x: x // 5 * 5):
...     print(list(g))

[0, 1, 3, 4]
[6, 7, 8]
[10, 14]
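
The same idea can be wrapped in a reusable function. This is only a sketch: the name `group_by_chunk` and the `chunk_size` parameter are made up for illustration, and it sorts by bucket first because `groupby` only merges adjacent items.

```python
from itertools import groupby


def group_by_chunk(numbers, chunk_size=5):
    """Group integers into lists that share the same chunk_size-wide bucket."""
    # Floor division maps 0-4 to bucket 0, 5-9 to bucket 1, and so on.
    def bucket(n):
        return n // chunk_size

    # groupby only merges adjacent items, so sort by bucket first.
    # sorted() is stable, so the original order is kept within each bucket.
    ordered = sorted(numbers, key=bucket)
    return [list(group) for _, group in groupby(ordered, key=bucket)]


a = [0, 1, 3, 4, 6, 7, 8, 10, 14]
b, c, d = group_by_chunk(a)
print(b, c, d)  # [0, 1, 3, 4] [6, 7, 8] [10, 14]
```

Unpacking into `b, c, d` only works when there are exactly three non-empty buckets; for arbitrary input, keep the returned list of lists.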

0

How time-efficient we can be depends on what we know about the numbers we're working with. We could also come up with a very quick approach that's terribly memory inefficient, but consider this one if it fits your purposes:

# something to store our new lists in
chunk_size = 5  # you said bounds of 5, right?
groups = []
for number in a:
    found = False
    for group in groups:
        # if our number is within the same range as the group's first number, add it
        if group[0] // chunk_size == number // chunk_size:
            group.append(number)
            found = True
            break
    if not found:
        # no existing group covers this number, so start a new one
        groups.append([number])
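
As an example of using what we know about the numbers: if the input is already sorted, as `a` is in the question, there is no need to search the existing groups, because a number either belongs to the most recent group or starts a new one. A minimal single-pass sketch under that assumption (the name `split_sorted` is hypothetical):

```python
def split_sorted(numbers, chunk_size=5):
    """Split an already-sorted list of integers into chunk_size-wide groups."""
    groups = []
    current_bucket = None
    for number in numbers:
        bucket = number // chunk_size
        # Sorted input means a new bucket can only appear after the current one ends.
        if bucket != current_bucket:
            groups.append([])
            current_bucket = bucket
        groups[-1].append(number)
    return groups


print(split_sorted([0, 1, 3, 4, 6, 7, 8, 10, 14]))
```

On unsorted input this would produce several groups for the same range, so it is not a drop-in replacement for the loop above.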

0

Now that I understand your definition of groups better, I think this relatively simple answer will not only work, it should also be very fast:

from collections import defaultdict

a = [0, 1, 3, 4, 6, 7, 8, 10, 14]
chunk_size = 5
buckets = defaultdict(list)

for n in a:
    # floor division gives the bucket index: 0-4 -> 0, 5-9 -> 1, ...
    buckets[n // chunk_size].append(n)

for bucket, values in sorted(buckets.items()):
    print('{}: {}'.format(bucket, values))

Output:

0: [0, 1, 3, 4]
1: [6, 7, 8]
2: [10, 14]

2 Comments

To explain my example a bit further, here is the end goal. I am querying our monitoring system for time series data so I can build a report. If I query for 1 hour of data, for example, I want to split it into 5-minute chunks; all timestamps are returned as epoch time. Because the data comes from our monitoring system, there may be gaps in the timestamps, for example when a machine failed and data was unavailable for that period. Thanks for the help.
Oh, in that case see my revised answer.
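
For the time-series case described in the comment above, the same bucketing can be applied to epoch timestamps. This is a rough sketch only: it assumes samples arrive as `(epoch_seconds, value)` pairs with integer timestamps, and the names `bucket_by_window` and `WINDOW_SECONDS` as well as the sample data are invented for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 5 * 60  # five-minute chunks


def bucket_by_window(samples, window=WINDOW_SECONDS):
    """Map each window start (epoch seconds) to the samples that fall inside it."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Round the timestamp down to the start of its window.
        window_start = timestamp - timestamp % window
        buckets[window_start].append((timestamp, value))
    return dict(buckets)


# Made-up samples with a gap: nothing was recorded in the two windows
# between the second and third sample.
samples = [(1370052000, 0.5), (1370052060, 0.7), (1370052900, 0.2)]
for start, points in sorted(bucket_by_window(samples).items()):
    print(start, points)
```

Windows with no data simply never appear as keys, so gaps in the monitoring data need no special handling.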
