I have

class rel:
   child=''
   parent=''
listPar=[]

and in listPar I have a list of these classes (sorry for the terms, I'm not sure if it is called a class, is it?). So in listPar I have, for example: room book ; book title ; room book ; book title

And now I'm trying to remove all non-unique occurrences, so I want to have only

room book ; book title in listPar

Let's assume that I have the following code:

variable="Book"
variable2="Author"
toIns=rel()
toIns.parent=variable 
toIns.child=variable2 
listPar.append(toIns) 

toIns2=rel()
toIns2.parent=variable
toIns2.child=variable2 
listPar.append(toIns2) 

Now, how do I remove all duplicates? The result should be:

for elem in listPar:
    print "child:",elem.child,"parent:",elem.parent

#child:author, parent:book

I have tried several things, but none of them seemed to fully work. Could you please help me?

2 Comments

The term would be objects of the class. Commented Apr 25, 2012 at 9:46
Or instances is also used a lot. Commented Apr 25, 2012 at 9:56

1 Answer

I'm presuming that the class you have given isn't your actual class (as it would be worthless as written). Presuming the order of your elements doesn't matter to you, the easiest thing to do is to turn your list into a set, which removes all duplicates.

>>> a = ["test", "test", "something", "else"]
>>> a
['test', 'test', 'something', 'else']
>>> set(a)
{'test', 'something', 'else'}

Here I have used strings, but you can use any class that provides an equality operator (__eq__) and a hash function (__hash__). Equality decides whether two instances count as the same (for a custom class, you need to define that), and the hash is what makes sets efficient. Two instances that compare equal must have the same hash. Two instances that are not equal may still share a hash (the set then falls back to the equality check), but the more often that happens, the slower lookups become. In general, combining the hashes of the components you compare in __eq__, for example by summing them or by hashing a tuple of them, is a good way to generate a decent hash.

So, for example:

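class Book:
    def __init__(self, title, author):
        self.title = title
        self.author = author

    def __eq__(self, other):
        # Only compare against other Book instances.
        if not isinstance(other, Book):
            return NotImplemented
        return self.title == other.title and self.author == other.author

    def __hash__(self):
        # Hash the same fields that __eq__ compares, so equal books hash alike.
        return hash((self.title, self.author))

    def __repr__(self):
        return "Book(" + repr(self.title) + ", " + repr(self.author) + ")"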

We can use this class like before.

>>> a = [Book("Some Book", "Some Guy"), Book("Some Book", "Some Guy"), Book("Some Other Book", "Some Other Guy")]
>>> a
[Book('Some Book', 'Some Guy'), Book('Some Book', 'Some Guy'), Book('Some Other Book', 'Some Other Guy')]
>>> set(a)
{Book('Some Other Book', 'Some Other Guy'), Book('Some Book', 'Some Guy')}

If you do care about the order of the elements, even after removing duplicates, then you could do this:

def remove_duplicates_preserving_order(seq):
    seen = set()
    return [ x for x in seq if x not in seen and not seen.add(x)]

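This works by hacking the list comprehension a little: set.add() always returns None, which is falsy, so "not seen.add(x)" is always true and exists only for its side effect of adding the element to the set. The "x not in seen" check is evaluated first, so add() is only called for elements that have not been seen yet.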
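If the one-liner is hard to read, the same idea can be written as a plain loop. This is only a sketch: the name dedupe_in_order and the sample list are made up for illustration, and it assumes the items are hashable. The dict.fromkeys() line is a different technique that relies on dicts keeping insertion order, which is guaranteed from Python 3.7 on.

```python
def dedupe_in_order(seq):
    """Return the items of seq without duplicates, keeping first-seen order.

    Hypothetical helper name; items must be hashable.
    """
    seen = set()
    result = []
    for item in seq:
        if item not in seen:
            seen.add(item)       # remember the item
            result.append(item)  # keep only its first occurrence
    return result


items = ["test", "test", "something", "else", "something"]
deduped = dedupe_in_order(items)
print(deduped)  # ['test', 'something', 'else']

# Alternative on Python 3.7+: dict keys are unique and keep insertion order.
print(list(dict.fromkeys(items)))  # ['test', 'something', 'else']
```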

Edit for update:

Please note that PEP 8 recommends using CapWords for class names, and lowercase_with_underscores for variables and functions.

You seem to have a misunderstanding about how Python classes work. As written, child and parent are class attributes, not instance attributes, so the class itself holds a single shared default for each of them. Assigning toIns.parent = ... does create a separate attribute on that one instance, which is why your code appears to work, but any instance you have not assigned to still reads the shared class value, and a mutable class attribute (a list, say) would be shared by every instance. That's not what you want here.
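To see how class attributes and instance attributes behave, here is a small sketch. The classes Shared and PerInstance are made up for this illustration and are not part of your code.

```python
class Shared:
    tags = []   # class attribute: one list object shared by every instance
    name = ""   # class attribute: a default that every instance can read


class PerInstance:
    def __init__(self, name):
        self.name = name  # instance attribute
        self.tags = []    # a fresh list for each instance


a, b = Shared(), Shared()
a.tags.append("x")   # mutates the single shared list
print(b.tags)        # ['x']

a.name = "Book"      # assignment creates an instance attribute on a only
print(repr(b.name))  # '' (b still reads the class default)

p, q = PerInstance("Book"), PerInstance("Author")
p.tags.append("x")
print(q.tags)        # []
```

Mutating a shared class attribute is visible through every instance, while assigning through an instance only shadows the class attribute for that one instance.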

To make instance attributes (the kind you want), create them inside the constructor (__init__()); see my example class for how this works. Once you have done this, implement __eq__() and __hash__() so that Python knows what it means for two items of your class to be equal. You can then use the methods I described above (either a set or the function I gave) to remove duplicates.
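Applied to the code in your question, that could look roughly like the sketch below. I have renamed rel to Rel and listPar to list_par to follow PEP 8, and the constructor arguments are my assumption about how you want to build the objects. The print calls use Python 3 syntax.

```python
class Rel:
    """A parent/child relation, e.g. Rel("Book", "Author")."""

    def __init__(self, parent, child):
        self.parent = parent
        self.child = child

    def __eq__(self, other):
        if not isinstance(other, Rel):
            return NotImplemented
        return (self.parent, self.child) == (other.parent, other.child)

    def __hash__(self):
        # Hash exactly the fields that __eq__ compares.
        return hash((self.parent, self.child))

    def __repr__(self):
        return "Rel({!r}, {!r})".format(self.parent, self.child)


list_par = [Rel("Book", "Author"), Rel("Book", "Author"), Rel("Room", "Book")]

# Remove duplicates while keeping the original order.
seen = set()
unique = []
for rel in list_par:
    if rel not in seen:
        seen.add(rel)
        unique.append(rel)

for elem in unique:
    print("child:", elem.child, "parent:", elem.parent)
# child: Author parent: Book
# child: Book parent: Room
```

If the order does not matter, set(list_par) gives the same two relations.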

Note that if this is all you wish to do with your data, a class might be overkill. If you are always going to have two items, you could just use a tuple:

>>> a = [("Book", "Author"), ("Book", "Author"), ("OtherBook", "OtherAuthor")] 
>>> set(a)
{('Book', 'Author'), ('OtherBook', 'OtherAuthor')}

This works because tuples already define equality and hashing for you, element by element (as long as the elements themselves are hashable).
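If you like the tuple approach but still want named fields, collections.namedtuple from the standard library is one option. This is just a sketch; the type name Rel and the field names parent and child are taken from your question, and the sample data is made up.

```python
from collections import namedtuple

# A tuple subclass with named fields; equality and hashing work as for tuples.
Rel = namedtuple("Rel", ["parent", "child"])

list_par = [Rel("Book", "Author"), Rel("Book", "Author"), Rel("Room", "Book")]
unique = set(list_par)

print(len(unique))                                  # 2
print(Rel("Book", "Author").child)                  # Author
print(Rel("Book", "Author") == ("Book", "Author"))  # True
```

Note that a namedtuple is immutable, so you cannot assign to parent or child after creation.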

Overall, you seem to lack an understanding of how classes are constructed and used in Python - I would suggest you go read up and learn how to use them before anything else, as not doing so will cause you a lot of problems.

4 Comments

I'd say he was trying to describe relations, like Rel('Room','Book') meaning that a room can contain a book, and then storing those values. But you're totally right, I would've suggested adding __eq__ to Rel, too.
@phg I was really just giving a general example, the idea could be applied anywhere.
I'm sorry, I'm not sure what to do now. Let's assume that I have the following code: class rel: child='' parent='' listPar=[] toIns=rel() toIns.parent=variable toIns.child=variable2 listPar.append(toIns) and now how to remove all duplicates?
@Johnzzz OK, please add your class to the question by editing it so I can read the code.
