Removing duplicates in list of lists

Question

I have a list of lists, and each sublist has 4 items (integers and floats). I want to remove those sublists whose values at index 1 and index 3 both match those of another sublist.

[[1, 2, 0, 50], [2, 19, 0, 25], [3, 12, 25, 0], [4, 18, 50, 50], [6, 19, 50, 67.45618854993529], [7, 4, 50, 49.49657024231138], [8, 12, 50, 41.65340802385248], [9, 12, 50, 47.80600357035001], [10, 18, 50, 47.80600357035001], [11, 18, 50, 53.222014760339356], [12, 18, 50, 55.667812693447615], [13, 12, 50, 41.65340802385248], [14, 12, 50, 47.80600357035001], [15, 13, 50, 47.80600357035001], [16, 3, 50, 49.49657024231138], [17, 3, 50, 49.49657024231138], [18, 4, 50, 49.49657024231138], [19, 5, 50, 49.49657024231138]]

For example, [7, 4, 50, 49.49657024231138] and [18, 4, 50, 49.49657024231138] have the same values at index 1 and 3, so I want to remove one of them; which one doesn't matter.

I have looked at code that lets me do this on the basis of a single index:

def unique_items(L):
    found = set()
    for item in L:
        if item[1] not in found:
            yield item
            found.add(item[1])

I have been using this code, which removes sublists but only on the basis of a single index. (I haven't completely understood the code, but it is working.)

Hence, the problem is removing sublists on the basis of duplicate values at both index 1 and index 3 in the list of lists.

alexanderlukanin13 · Accepted Answer · 2015-03-14 05:33:29Z

3

If you need to compare (item[1], item[3]), use a tuple. A tuple is hashable as long as its elements are, so it can be used as a set member or dict key.

def unique_items(L):
    found = set()
    for item in L:
        key = (item[1], item[3])  # use tuple as key
        if key not in found:
            yield item
            found.add(key)
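
One thing that may not be obvious from the function alone: because it uses yield, calling it returns a generator, so you need to wrap it in list() (or loop over it) to get the filtered sublists. A minimal usage sketch, assuming a shortened copy of the data from the question (the names data and result are just for illustration):

```python
def unique_items(L):
    found = set()
    for item in L:
        key = (item[1], item[3])  # use tuple as key
        if key not in found:
            yield item
            found.add(key)

# A few rows taken from the question's list
data = [
    [7, 4, 50, 49.49657024231138],
    [16, 3, 50, 49.49657024231138],
    [17, 3, 50, 49.49657024231138],
    [18, 4, 50, 49.49657024231138],
    [19, 5, 50, 49.49657024231138],
]

# unique_items() is a generator function, so materialize it with list()
result = list(unique_items(data))
print([row[0] for row in result])  # prints [7, 16, 19]
```

The first sublist seen for each (index 1, index 3) pair is kept and later ones are dropped, so the original order is preserved.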


Tim Pietzcker · Accepted Answer · 2015-03-14 05:33:53Z

1

This is how you could make it work:

def unique_items(L):
    # Build a set to keep track of all the (index 1, index 3) pairs we've found so far
    found = set()
    for item in L:
        # Now check if the elements at index 1 and 3 of the current item already are in the set
        if (item[1], item[3]) not in found:
            # if it's new, then add the elements at index 1 and 3 as a tuple to our set
            found.add((item[1], item[3]))
            # and give back the current item
            # (I find this order more logical, but it doesn't matter much)
            yield item
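
If you later need to deduplicate on a different set of positions, the same idea generalizes. This is only a sketch: unique_by is a hypothetical helper name, not something from the standard library. It relies on operator.itemgetter, which returns a tuple when given more than one index.

```python
from operator import itemgetter

def unique_by(rows, *indices):
    """Yield the first row seen for each combination of values at `indices`."""
    key_of = itemgetter(*indices)  # e.g. itemgetter(1, 3)(row) -> (row[1], row[3])
    seen = set()
    for row in rows:
        key = key_of(row)
        if key not in seen:
            seen.add(key)
            yield row

rows = [
    [8, 12, 50, 41.65340802385248],
    [9, 12, 50, 47.80600357035001],
    [13, 12, 50, 41.65340802385248],
    [14, 12, 50, 47.80600357035001],
]
deduped = list(unique_by(rows, 1, 3))
```

With the four rows above, only the first two survive, since rows 13 and 14 repeat the (index 1, index 3) pairs of rows 8 and 9.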


Saksham Varma · Accepted Answer · 2015-03-14 07:12:35Z

0

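This should work (here lists is your list of lists):

from pprint import pprint

d = {}
for sublist in lists:
    # build a string key from the values at index 1 and 3
    k = str(sublist[1]) + ',' + str(sublist[3])
    if k not in d:
        d[k] = sublist
# list() so that Python 3 prints the sublists rather than a dict_values view
pprint(list(d.values()))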
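
A variation on the dict approach, offered as a sketch rather than a drop-in replacement: using a tuple as the key instead of a concatenated string. One difference worth knowing is that string keys treat 0 and 0.0 as different ("0" vs "0.0"), while tuple keys compare them as equal numbers. The function name unique_items_dict is made up for this example, and the result order relies on dicts preserving insertion order (Python 3.7+).

```python
def unique_items_dict(rows):
    first_seen = {}
    for row in rows:
        # setdefault stores the row only if this key is not present yet
        first_seen.setdefault((row[1], row[3]), row)
    return list(first_seen.values())

rows = [
    [1, 12, 50, 0],
    [2, 12, 50, 0.0],  # same as the first row by value: 0 == 0.0
    [3, 13, 50, 0],
]
kept = unique_items_dict(rows)
print(kept)  # prints [[1, 12, 50, 0], [3, 13, 50, 0]]
```

Whether 0 and 0.0 should count as duplicates depends on your data, so pick the key type that matches what you mean by "same".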
