Partition of a set in Python

Question

I have b buckets, numbered 0 to b-1, and m apples, numbered 0 to m-1. Initially, all apples are in bucket 0.

Running some analysis then moves the apples among the buckets. I have implemented this as a 2D list (one inner list per bucket), removing and appending apple ids whenever an apple moves between buckets. This is very inefficient for my analysis, because the number of moves is on the order of millions or billions. Is there a better way to implement such a structure?

By the way, I chose the title because this is very similar to the partition-of-a-set problem, in which no member can be placed in more than one subset. Here is an example with 4 apples and 3 buckets to make it clearer:

time 0:
a=[[0,1,2,3],[],[]]
time 1: (say apple 3 needs to be moved to bucket 2)
a=[[0,1,2],[],[3]]

Jean-François Fabre · Accepted Answer · 2017-01-16 20:28:10Z

6

The problem with removing an element from a list is that it takes O(n): the time grows with the number of elements in the list.

You are better off using sets or, even better, a bitarray, both of which handle a move in O(1) (on average, in the case of sets).

For example:

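from bitarray import bitarray  # third-party package: pip install bitarray

m = 50  # the number of apples
b = 10  # the number of buckets
fls = [False] * m
a = [bitarray(fls) for _ in range(b)]  # one bit per apple in every bucket
a[0] = bitarray([True] * m)  # bucket 0 starts out holding every apple

def move_apple(apple_id, from_bucket, to_bucket):
    a[from_bucket][apple_id] = False  # clear the apple's bit in the old bucket
    a[to_bucket][apple_id] = True     # set the apple's bit in the new bucket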
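
The answer also mentions sets as an option. As a rough sketch of that variant, using only the standard library (the name `buckets` is chosen here for illustration and is not from the original post), each bucket could be a Python set:

```python
m = 50  # the number of apples
b = 10  # the number of buckets

# One set per bucket; every apple starts in bucket 0.
buckets = [set() for _ in range(b)]
buckets[0] = set(range(m))

def move_apple(apple_id, from_bucket, to_bucket):
    # set.remove raises KeyError if the apple is not in from_bucket,
    # which helps catch bookkeeping mistakes early.
    buckets[from_bucket].remove(apple_id)
    buckets[to_bucket].add(apple_id)

# Example: move apple 3 from bucket 0 to bucket 2.
move_apple(3, 0, 2)
```

Both set operations take constant time on average, and iterating over one bucket only visits the apples it actually contains, whereas a bitarray bucket always holds m bits.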

edited Jan 16, 2017 at 20:28

Jean-François Fabre♦


answered Jan 16, 2017 at 20:06

willeM_ Van Onsem



4 Comments

Jean-François Fabre Over a year ago

[False for _ in range(m)] is overkill. For immutable objects you can do [False]*m. Using bitarray is a very nice idea.

Jean-François Fabre Over a year ago

you're welcome (that was minor). I took the liberty to slightly edit your post BTW. Removed "(with sets)" and fixed "beter" to "better". Much beter like that :)

juanpa.arrivillaga Over a year ago

It would be cool if you showed some timing experiments comparing the performance to a boolean list.

willeM_ Van Onsem Over a year ago

@juanpa.arrivillaga: If I find some time this night, I will try to work on that...

Stefan Pochmann · Accepted Answer · 2017-01-16 20:13:49Z

3

Just use an array where for each apple you store the bucket number?

time 0:
a=[0,0,0,0]
time 1: (say apple 3 needs to be moved to bucket 2)
a=[0,0,0,2]
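
A minimal sketch of this idea, assuming a plain Python list is acceptable (the names `bucket_of` and `apples_in` are illustrative, not from the original post):

```python
m = 4  # the number of apples

# bucket_of[i] is the bucket that currently holds apple i.
bucket_of = [0] * m

def move_apple(apple_id, to_bucket):
    # A move is a single O(1) list assignment; the old bucket is not needed.
    bucket_of[apple_id] = to_bucket

def apples_in(bucket):
    # The reverse lookup has to scan all apples, so it is O(m).
    return [apple for apple, bk in enumerate(bucket_of) if bk == bucket]

move_apple(3, 2)
```

Moves are as cheap as they can be here; the cost shifts to listing the contents of a bucket, which requires a full scan.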

answered Jan 16, 2017 at 20:13

Stefan Pochmann


1 Comment

user2517676 Over a year ago

I need the reverse structure to avoid using index() later in my script. index() is also very expensive: O(n), I assume.
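
If lookups are needed in both directions, one possible approach is to keep the per-apple array and per-bucket sets side by side and update both on every move. This is only a sketch under that assumption; the `Partition` class and its attribute names are hypothetical and not from the thread:

```python
class Partition:
    """Tracks which bucket each apple is in, with lookups in both directions."""

    def __init__(self, num_apples, num_buckets):
        # Forward index: apple id -> bucket number.
        self.bucket_of = [0] * num_apples
        # Reverse index: bucket number -> set of apple ids.
        self.buckets = [set() for _ in range(num_buckets)]
        self.buckets[0].update(range(num_apples))

    def move(self, apple_id, to_bucket):
        # Look up the current bucket in O(1) instead of searching for it.
        from_bucket = self.bucket_of[apple_id]
        if from_bucket == to_bucket:
            return
        self.buckets[from_bucket].remove(apple_id)
        self.buckets[to_bucket].add(apple_id)
        self.bucket_of[apple_id] = to_bucket


# Example from the question: 4 apples, 3 buckets, apple 3 moves to bucket 2.
p = Partition(4, 3)
p.move(3, 2)
```

Each move then costs constant time on average, and neither direction needs index() or a scan, at the price of storing every apple id twice.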
