I have two lists like so

found = ['CG', 'E6', 'E1', 'E2', 'E4', 'L2', 'E7', 'E5', 'L1', 'E2BS', 'E2BS', 'E2BS', 'E2', 'E1^E4', 'E5']
expected = ['E1', 'E2', 'E4', 'E1^E4', 'E6', 'E7', 'L1', 'L2', 'CG', 'E2BS', 'E3']

I want to find the differences between the two lists.
So far I have tried

list(set(expected)-set(found))

and

list(set(found)-set(expected))

These return ['E3'] and ['E5'], respectively.

However, the answers I need are:

'E3' is missing from found.
'E5' is missing from expected.
There are 2 copies of 'E5' in found.
There are 3 copies of 'E2BS' in found.
There are 2 copies of 'E2' in found.

Any help/suggestions are welcome!

3 Answers

8

The collections.Counter class excels at enumerating the differences between multisets:

>>> from collections import Counter
>>> found = Counter(['CG', 'E6', 'E1', 'E2', 'E4', 'L2', 'E7', 'E5', 'L1', 'E2BS', 'E2BS', 'E2BS', 'E2', 'E1^E4', 'E5'])
>>> expected = Counter(['E1', 'E2', 'E4', 'E1^E4', 'E6', 'E7', 'L1', 'L2', 'CG', 'E2BS', 'E3'])
>>> list((found - expected).elements())
['E2', 'E2BS', 'E2BS', 'E5', 'E5']
>>> list((expected - found).elements())
['E3']

You might also be interested in difflib.Differ:

>>> from difflib import Differ
>>> found = ['CG', 'E6', 'E1', 'E2', 'E4', 'L2', 'E7', 'E5', 'L1', 'E2BS', 'E2BS', 'E2BS', 'E2', 'E1^E4', 'E5']
>>> expected = ['E1', 'E2', 'E4', 'E1^E4', 'E6', 'E7', 'L1', 'L2', 'CG', 'E2BS', 'E3']
>>> for d in Differ().compare(expected, found):
...     print(d)

+ CG
+ E6
  E1
  E2
  E4
+ L2
+ E7
+ E5
+ L1
+ E2BS
+ E2BS
+ E2BS
+ E2
  E1^E4
+ E5
- E6
- E7
- L1
- L2
- CG
- E2BS
- E3
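
One caveat with Differ: it compares the two sequences in order, so much of the output above reflects that the lists are arranged differently rather than that their contents differ. If order doesn't matter to you, a rough sketch is to sort both lists before comparing. The variable names `extra_in_found` and `missing_from_found` are my own, not part of difflib.

```python
from difflib import Differ

found = ['CG', 'E6', 'E1', 'E2', 'E4', 'L2', 'E7', 'E5', 'L1',
         'E2BS', 'E2BS', 'E2BS', 'E2', 'E1^E4', 'E5']
expected = ['E1', 'E2', 'E4', 'E1^E4', 'E6', 'E7', 'L1', 'L2',
            'CG', 'E2BS', 'E3']

# Sort both sides so the diff reflects content rather than ordering.
diff = list(Differ().compare(sorted(expected), sorted(found)))

# Lines starting with '+ ' come from found (the second sequence);
# lines starting with '- ' come from expected (the first sequence).
# Strip the two-character prefix to recover the original items.
extra_in_found = [line[2:] for line in diff if line.startswith('+ ')]
missing_from_found = [line[2:] for line in diff if line.startswith('- ')]

print(extra_in_found)
print(missing_from_found)
```

Unlike the set approach, this keeps duplicates, so an item that appears three times in found and once in expected should show up as extra more than once.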
1 Comment

Great answer on the difflib.Differ. I began using that sometimes instead of sets.
4

Leverage the Python set class and Counter class instead of rolling your own solution:

  1. symmetric_difference: finds elements that are in one set or the other, but not both.
  2. intersection: finds elements common to both sets.
  3. difference: finds elements in one set that are not in the other, which is essentially what you did by subtracting one set from another.

Code examples

  • set(found).difference(expected) # {'E5'}
    
  • set(expected).difference(found) # {'E3'}
    
  • set(found).symmetric_difference(expected) # {'E3', 'E5'}
    
  • Finding copies of objects: a question covering this was already referenced. Using that technique gets you all the duplicates, and the resulting Counter object tells you how many copies of each there are. For example:

    collections.Counter(found)['E5'] # 2
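
Putting the set operations and the Counter lookup together, here is a minimal sketch that produces the messages asked for in the question. The helper name `report_differences` and the exact message wording are my own choices, not anything provided by the standard library.

```python
from collections import Counter


def report_differences(found, expected):
    """Hypothetical helper: describe missing items and duplicates as messages."""
    found_set, expected_set = set(found), set(expected)
    messages = []
    # Items present in one list but absent from the other (sorted for stable output).
    for item in sorted(expected_set - found_set):
        messages.append(f"{item!r} is missing from found.")
    for item in sorted(found_set - expected_set):
        messages.append(f"{item!r} is missing from expected.")
    # Items that occur more than once in found, in first-seen order.
    for item, count in Counter(found).items():
        if count > 1:
            messages.append(f"There are {count} copies of {item!r} in found.")
    return messages


found = ['CG', 'E6', 'E1', 'E2', 'E4', 'L2', 'E7', 'E5', 'L1',
         'E2BS', 'E2BS', 'E2BS', 'E2', 'E1^E4', 'E5']
expected = ['E1', 'E2', 'E4', 'E1^E4', 'E6', 'E7', 'L1', 'L2',
            'CG', 'E2BS', 'E3']

for line in report_differences(found, expected):
    print(line)
```

The sets answer the "missing" questions and the Counter answers the "how many copies" questions, so neither one alone is enough here.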
    

2

You've already answered the first two:

print('{0} missing from found'.format(list(set(expected) - set(found))))
print('{0} missing from expected'.format(list(set(found) - set(expected))))

The rest require counting duplicates in a list, for which there are many solutions to be found online (including this one: Find and list duplicates in a list?).
