1

Okay, I have two lists, List 1 and List 2. I want to find all of the items that are in both list 1 and list 2, and remove them from list 1. The first way I've thought about doing this is looping through list 1 and then looping through list 2 to see if it is in list 2 but that seems slow and inefficient when scaled up. Is there a more efficient way of doing this?

Also, these lists will be ordered alphabetically (they're strings), if that helps anything.

I'm using python, but I'm also wondering from a general programming perspective.

list1 = ['bar','foo','hello','hi']
list2 = ['alpha','bar','hello','xam']

list1 would become ['foo','hi']

1
  • You can try list(set(list1) - (set(list1) & set(list2))). Commented Sep 16, 2020 at 14:07

3 Answers 3

6

In python, you'd probably want to use a set:

intersection = set(list1).intersection(list2)

This will return a set which destroys the order (among other things), but you can always use that set to filter list1 afterward:

list1 = [x for x in list1 if x not in intersection]

The intersection is most useful if you actually want to use the set. As pointed out in the comments, it's not actually necessary if you don't want a set at all:

set2 = set(list2)
list1 = [x for x in list1 if x not in set2]
Sign up to request clarification or add additional context in comments.

4 Comments

Using sample output from question, this gave set(['bar', 'hello']) which is not what poster wanted. Is this version dependent? (I'm on 2.7.6?)
Actually, you do not need the intersection, since all the elements you are looping over are from list1 anyway.
@wnnmaw -- I originally misunderstood OP's question. I updated to add a not in the list-comp to get the expected output.
@tobias_k -- good point. In the general case, I'm used to wanting to work with the set and not just use it to filter a list. I've updated.
4

Use a set to get the difference between the two:

list1 = ['bar','foo','hello','hi']
list2 = ['alpha','bar','hello','xam']

set1 = set(list1)
set2 = set(list2)
set1 - set2

Outputs:

set(['hi', 'foo'])

As noted by @chepner, using set.difference, only the first needs to be converted to a set

set1.difference(list2)

If order is important, make one of them a set, and compare the other against it:

set2 = set(list2)
[x for x in list1 if x not in set2]

Outputs:

['foo', 'hi']

2 Comments

- is an operator form of the difference method, so you could use set(list1).difference(list2) to avoid an extra traversal of list2.
thanks, edited for that. but it's unfortunate you cant do set - list
1

Here's a solution using a general programming approach, not using sets, and not particularly optimized. It relies on the two lists being sorted.

list1 = ['a', 'b', 'd', 'f', 'k']
list2 = ['c', 'd', 'i']
result = []

i1 = 0
i2 = 0
while i1 < len(list1) and i2 < len(list2):
    # invariants:
    #    list1[i1] not in list2[:i2], and
    #    result == (list1[:i1] with elements of list2[:i2] omitted)
    #
    if list1[i1] < list2[i2]:
        # By assumption, list1[i1] not in list2[:i2],
        # and because list2 is sorted, the true 'if' condition
        # implies that list1[i1] isn't in list2[i2:] either;
        # that is, it isn't in list2 at all.
        result.append(list1[i1])
        i1 += 1
    elif list1[i1] > list2[i2]:
        # can't decide membership of list1[i1] yet;
        # advance to next element of list2 and loop again
        i2 += 1
    else:
        # list1[i1] == list2[i2], so omit this element
        i1 += 1
        i2 += 1

# Add any remaining elements of list1 to tail of result
if i1 < len(list1):
    result.extend(list1[i1:])

print(result)

Result: ['a', 'b', 'f', 'k']

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.