
I have a list of Python objects and I'd like to remove duplicates from the list based on a time value. For example:

class MyClass(object):

    identifier = models.CharField(max_length=128)
    label = models.CharField(max_length=128)
    stat_time = models.DateTimeField(auto_now_add=True)
    def __unicode__(self):
        return str(self.label)

My list may have several instances of MyClass with the same label but different stat_times. I'd like to trim the list and have only one instance of the label with the latest stat_time.

>>> my_list
[MyClass: xxx, MyClass: yyy, MyClass: yyy, MyClass: zzz]

I'd like to end up with:

>>> my_list
[MyClass: xxx, MyClass: yyy, MyClass: zzz]

Here my_list should only contain one instance of MyClass with the 'yyy' label with the latest stat_time.

I hope I have made that clear. Any suggestions much appreciated.

  • Is this a Django question? Commented Jan 3, 2014 at 16:13
  • I haven't used Django but I imagine it would be easier to make sure copies aren't added in the first place instead of sanitizing the list afterwards. Commented Jan 3, 2014 at 16:15
  • Is the order of items important in your list? Commented Jan 3, 2014 at 16:18
  • What attempts have you made and what's wrong with them? Commented Jan 3, 2014 at 16:21
  • Do you want to filter on __unicode__ or stat_time? As your Q currently stands, it's a bit hard to understand. Commented Jan 3, 2014 at 16:34

2 Answers

One way to do it is to build a dict that maps each label to a MyClass instance. Add each element of your list to this dict, keeping only the instance with the latest stat_time for each label.

aDict = dict()
for element in my_list:
    s = element.label
    if s not in aDict:  # the key is not used yet
        aDict[s] = element
    else:
        # keep whichever instance has the later stat_time
        aDict[s] = max(aDict[s], element, key=lambda x: x.stat_time)
# values(), not items(): we want the instances, not (label, instance) pairs
my_list = list(aDict.values())

The lambda expression passed into max tells Python which value to compare when computing the max.
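
As a self-contained sketch of the approach above, the following swaps the Django model for a hypothetical plain-Python stand-in called `Stat`, so it runs without Django. The helper name `latest_per_label` is also made up for illustration, and the sketch assumes the `stat_time` values can be compared with `>`.

```python
from datetime import datetime


class Stat(object):
    """Hypothetical stand-in for MyClass (not a Django model)."""

    def __init__(self, label, stat_time):
        self.label = label
        self.stat_time = stat_time

    def __repr__(self):
        return "Stat(%r, %s)" % (self.label, self.stat_time.isoformat())


def latest_per_label(items):
    """Return one item per label, keeping the one with the latest stat_time."""
    latest = {}
    for item in items:
        current = latest.get(item.label)
        # Store the item if the label is new or this item is more recent.
        if current is None or item.stat_time > current.stat_time:
            latest[item.label] = item
    return list(latest.values())


my_list = [
    Stat("xxx", datetime(2014, 1, 3, 16, 0)),
    Stat("yyy", datetime(2014, 1, 3, 16, 5)),
    Stat("yyy", datetime(2014, 1, 3, 16, 10)),
    Stat("zzz", datetime(2014, 1, 3, 16, 15)),
]
result = latest_per_label(my_list)
```

On Python 3.7 and later, dicts preserve insertion order, so the result follows the first appearance of each label. On older versions the order of the result is arbitrary.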


3 Comments

You can rewrite the max clause as element if element.stat_time > aDict[s].stat_time else aDict[s] so that it reads almost like plain English.
Thanks, however will this not just add the first element encountered?
@9000 Good suggestion, I'll keep that in mind for the future.

I'm not sure if you should filter your object based on __unicode__(), but here is how I would have done it.

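unique_objs = []  # (label, stat_time) pairs already seen
new_list = []     # objects kept so far

for o in my_list:
    key = (o.__unicode__(), o.stat_time)
    if key in unique_objs:
        continue  # exact (label, stat_time) duplicate, skip it
    new_list.append(o)
    unique_objs.append(key)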
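
The loop above only skips objects whose label and stat_time both match an earlier one. A possible variation that keeps just the latest object per label is sketched below: it sorts newest first, then keeps the first object seen for each label. `Stat` is a hypothetical namedtuple standing in for MyClass and `dedupe_latest` is an illustrative name; neither comes from the question.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for MyClass, with only the two fields used here.
Stat = namedtuple("Stat", ["label", "stat_time"])


def dedupe_latest(items):
    """Keep one object per label: the one with the latest stat_time."""
    seen = set()
    kept = []
    # Newest first, so the first object met for a label is its latest.
    for o in sorted(items, key=lambda o: o.stat_time, reverse=True):
        if o.label in seen:
            continue
        seen.add(o.label)
        kept.append(o)
    return kept


my_list = [
    Stat("xxx", datetime(2014, 1, 3, 16, 0)),
    Stat("yyy", datetime(2014, 1, 3, 16, 5)),
    Stat("yyy", datetime(2014, 1, 3, 16, 10)),
    Stat("zzz", datetime(2014, 1, 3, 16, 15)),
]
new_list = dedupe_latest(my_list)
```

Note that the result comes back ordered newest first rather than in the original list order, and a set is used for the membership test because looking up a label in a list gets slower as the list grows.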

4 Comments

Thanks. However, that will only get me a unique list of objects based on the name. I need a unique list based on the name AND the latest stat_time.
@KevinSheahan I changed the answer. Does this suit your needs?
Thanks. unique_objs.add can only take one argument, and there is no comparison for the latest time but the solution will probably look something like this.
@KevinSheahan I updated the answer, it should work now
