Fastest way to remove duplicates in list of lists in Python?

Question

I have a list of lists in Python3, where the data looks like this:

['Type1', ['123', '22'], ['456', '80']]
['Type2', ['456', '80'], ['123', '22']]

The list is quite large, but the above is an example of duplicate data I need to remove. Below is an example of data that is NOT duplicated and does not need to be removed:

['Type1', ['789', '45'], ['456', '80']]
['Type2', ['456', '80'], ['123', '22']]

I've already removed all the exact duplicates. What is the fastest way to accomplish this "reversed duplicate" removal in Python 3?

I get that you want to remove duplicates, but can you please show us the desired output?
– U13-Forward, Commented Mar 11, 2019 at 2:00
So, it appears you define "duplicate" as "having the exact same sublists but in any order", yes? Your first example is a duplicate because both type lines have [123,22] and [456,80], just in a different order. The second is not a dupe because, though both lines have [456,80], the other sublists are different. Is that what you're getting at? Does the order matter inside the sublists (can [1,2] and [2,1] be considered identical sublists)?
– paxdiablo, Commented Mar 11, 2019 at 2:05
Which data are you comparing for duplicates? And once a duplicate is found, what next? Do you want to remove a sublist, or replace the duplicate number with zero? Do you want to keep the duplicates that are found? Do you want to keep the original in the Type1 list or in the Type2 list?
– s3n0, Commented Mar 11, 2019 at 2:22

user447688 · Accepted Answer · 2019-03-11 02:19:58Z

1

Two possibilities:

1. Convert each sublist to a tuple and put the tuples into a set. Do the same for the row you are comparing against, then compare the two sets for equality.
2. Establish a sort order for the sublists, then sort each list of sublists. Two rows with the same sublists then compare equal regardless of their original order.

Both these approaches basically work around your problem of sublist ordering; there are lots of other ways.
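
Both possibilities can be sketched as a key function plus a single pass that remembers the keys already seen. This is only a minimal illustration, not the answer author's code: the names set_key, sorted_key and drop_reversed_duplicates are hypothetical, and it assumes the type label (the first element) is ignored when comparing and that the first row of each duplicate group is the one to keep, neither of which the question states explicitly.

```python
def set_key(row):
    """Option 1: sublists as a frozenset of tuples (order and repeats ignored)."""
    return frozenset(tuple(sub) for sub in row[1:])


def sorted_key(row):
    """Option 2: sublists sorted into a canonical order (repeats preserved)."""
    return tuple(sorted(tuple(sub) for sub in row[1:]))


def drop_reversed_duplicates(rows, key=set_key):
    """Keep the first row for each key; later rows with the same key are dropped.

    Assumption: the type label in row[0] is not part of the comparison.
    """
    seen = set()   # keys of rows already kept
    kept = []      # surviving rows, in their original order
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            kept.append(row)
    return kept


rows = [
    ['Type1', ['123', '22'], ['456', '80']],
    ['Type2', ['456', '80'], ['123', '22']],  # reversed duplicate of the first row
    ['Type1', ['789', '45'], ['456', '80']],  # shares one sublist only, so it is kept
]
result = drop_reversed_duplicates(rows)
print(result)
```

Because each row is hashed once and looked up in a set, the pass is roughly linear in the number of rows, whereas comparing every row against every other row is quadratic. Passing key=sorted_key switches to the second possibility; the two differ only when a row repeats the same sublist.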


Plajerity · Accepted Answer · 2019-03-11 02:30:33Z

1

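data = [['Type1', ['123', '22'], ['456', '80']],
        ['Type2', ['456', '80'], ['123', '22']]]

# Build an order-independent key per row: convert the sublists to tuples
# and sort them, so (a, b) and (b, a) become the same key.
myList = []
for i in data:
    myTuple = tuple(sorted((tuple(i[1]), tuple(i[2]))))
    myList.append(myTuple)

print(myList)

# Keep only the first row for each key. Building a new list avoids
# removing items from myList while iterating over it.
seen = set()
result = []
for row, key in zip(data, myList):
    if key not in seen:
        seen.add(key)
        result.append(row)

print(result)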
