0

I have a list of lists in Python 3, where the data looks like this:

['Type1', ['123', '22'], ['456', '80']]
['Type2', ['456', '80'], ['123', '22']]

The list is quite large, but the above is an example of duplicate data I need to remove. Below is an example of data that is NOT duplicated and does not need to be removed:

['Type1', ['789', '45'], ['456', '80']]
['Type2', ['456', '80'], ['123', '22']]

I've already removed all the exactly identical duplicates. What is the fastest way to accomplish this "reversed duplicate" removal in Python 3?

4
  • I get that you want to remove duplicates, but can you please show us the desired output? Commented Mar 11, 2019 at 2:00
  • So, it appears you define "duplicate" as "having the exact same sublists but in any order", yes? So your first example is a duplicate because both type lines have [123,22] and [456,80] but in a different order. The second is not a dupe because, though they both have [456,80], the other sublists are different. Is that what you're getting at? Does the order matter inside the sublists (can [1,2] and [2,1] be considered identical sublists)? Commented Mar 11, 2019 at 2:05
  • Between which data are you looking for duplicates? And if a duplicate value is found, what next? Do you want to remove a sublist? Or replace the duplicate number with zero? Do you want to leave the duplicates that are found? Do you want to keep the original one in the Type1 list or in the Type2 list? Commented Mar 11, 2019 at 2:22
  • The duplicate would be removed if it's of Type1. Commented Mar 11, 2019 at 16:36

2 Answers

1

Two possibilities:

  1. Convert each sublist to a tuple and insert into a set. Do the same for the compare candidate and compare sets to determine equality.

  2. Establish a sorting method for the sublists, then sort each list of sublists. This will enable easy comparison.

Both these approaches basically work around your problem of sublist ordering; there are lots of other ways.
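As a rough sketch of both options, assuming every row has the shape [type, sublist, sublist, ...] as in the question. The helper names `set_key` and `sorted_key` are made up for illustration, they are not from the question:

```python
def set_key(row):
    """Option 1 (hypothetical helper): key built from a set of tuples."""
    # Lists are not hashable, so each sublist becomes a tuple first.
    return frozenset(tuple(sub) for sub in row[1:])


def sorted_key(row):
    """Option 2 (hypothetical helper): key built by sorting the sublists."""
    return tuple(sorted(tuple(sub) for sub in row[1:]))


a = ['Type1', ['123', '22'], ['456', '80']]
b = ['Type2', ['456', '80'], ['123', '22']]
c = ['Type1', ['789', '45'], ['456', '80']]

print(set_key(a) == set_key(b))        # True: same sublists, reversed
print(sorted_key(a) == sorted_key(b))  # True
print(set_key(a) == set_key(c))        # False: one sublist differs
```

One difference worth knowing: the set version ignores a sublist that is repeated within the same row, while the sorted version keeps the count. For rows with exactly two distinct sublists, both should behave the same.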


1
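data = [['Type1', ['123', '22'], ['456', '80']],
        ['Type2', ['456', '80'], ['123', '22']]]

seen = set()   # order-independent keys already encountered
myList = []    # rows kept after removing reversed duplicates
for row in data:
    # Sort the sublists (as hashable tuples) so reversed rows get the same key.
    key = tuple(sorted(tuple(sub) for sub in row[1:]))
    if key not in seen:
        seen.add(key)
        myList.append(row)

print(myList)  # only the first row of each duplicate group remains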
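
Since the asker commented that the duplicate to remove is the Type1 one, here is a possible single-pass variant that prefers to keep the non-Type1 row. This is only a sketch: the function name `drop_reversed_duplicates` and the `drop_type` parameter are invented for this example, and it assumes the type label is always the first element of each row.

```python
def drop_reversed_duplicates(rows, drop_type='Type1'):
    """Hypothetical helper: keep one row per group of matching sublists,
    discarding rows of `drop_type` when another type is available."""
    best = {}  # canonical key -> row kept so far (dicts keep insertion order)
    for row in rows:
        # Sorting the sublists makes reversed rows produce the same key.
        key = tuple(sorted(tuple(sub) for sub in row[1:]))
        kept = best.get(key)
        # Keep the first row seen, unless it is of the type to drop
        # and a row of another type shows up later.
        if kept is None or (kept[0] == drop_type and row[0] != drop_type):
            best[key] = row
    return list(best.values())


data = [
    ['Type1', ['123', '22'], ['456', '80']],
    ['Type2', ['456', '80'], ['123', '22']],
    ['Type1', ['789', '45'], ['456', '80']],
]
print(drop_reversed_duplicates(data))
```

Each row is hashed once, so the work should grow roughly linearly with the number of rows, which matters for a large list more than the nested-loop comparison does.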
