0

I have code that prints 10 lists of numbers.

import random

def my_random_list(l: list):
    return sorted(random.sample(list(set(l)), 6))


for _ in range(10):
    print(sorted(my_random_list([i for i in range(1, 43)])))

I need to count how many duplicates there are across these 10 lists. How can I do it in a short and efficient way?

5
  • 1
    Can you share what you're trying to do with this code? Commented Mar 1, 2020 at 17:15
  • And what have you tried so far? Commented Mar 1, 2020 at 17:16
  • 1
    How about using collections.Counter? Counter is Python's bag (multiset) data structure. If you put all your lists into a Counter, you can see which elements are duplicated (a count of 2 or more) and how many times each one occurs. Commented Mar 1, 2020 at 17:16
  • 1
    You sort the same list twice; that's not needed. Commented Mar 1, 2020 at 17:17
  • 1
    @PedroLobito I'm just trying to improve my Python skills by solving interesting tasks. Commented Mar 1, 2020 at 18:09

5 Answers

1

You can use:

import random
from collections import defaultdict

def my_random_list(l: list):
    return sorted(random.sample(list(set(l)), 6))

repeated = defaultdict(int)
for _ in range(10):
    rl = my_random_list([i for i in range(1, 43)])
    for x in rl:
        repeated[x] += 1
    print(sorted(rl))

repeated = {k:v for k,v in repeated.items() if v > 1}
print(repeated)
# {2: 2, 5: 3, 19: 4, 21: 4, 4: 3, 8: 2, 14: 2, 38: 3, 9: 3, 24: 2, 40: 3, 42: 2, 10: 2, 22: 3, 32: 2, 18: 3, 34: 2, 30: 2, 31: 3}
print(len(repeated.keys())) # how many duplicates




1

Convert the list to a set, which automatically gets rid of duplicates. Then compare their size:

l = [1,2,3,4,5,6,7,7,6,5,4]
print(len(l) - len(set(l)))
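
Applied to several lists at once, the same idea might look like the sketch below. It assumes "duplicates" means every occurrence of a value beyond its first one, counted across all lists combined. The helper name count_surplus and the sample data are made up for illustration.

```python
from itertools import chain


def count_surplus(lists):
    """Count occurrences beyond the first of each value, across all lists."""
    flat = list(chain.from_iterable(lists))  # flatten the lists into one
    return len(flat) - len(set(flat))


example = [[1, 2, 3], [2, 3, 4], [3, 5, 6]]  # hypothetical sample data
print(count_surplus(example))  # 2 occurs twice, 3 occurs three times: 1 + 2 = 3
```

Note that this gives the number of surplus occurrences, not the number of distinct values that repeat; for the sample above those are 3 and 2 respectively.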


1

If your intention is to find the duplicates across the 10 lists, you can try the following:

# Import random (needed by my_random_list) and Counter from collections
In [11]: import random
    ...: from collections import Counter

# Your definition of my_random_list
In [12]: def my_random_list(l: list):
    ...:     return sorted(random.sample(list(set(l)), 6))
    ...:

# Copying your version of creating 10 lists into a lists variable (calling the sorted() here is superfluous in my opinion)
In [13]: lists = [sorted(my_random_list([i for i in range(1, 43)])) for _ in range(10)]

# Count all the entries across all the 10 lists
In [14]: counter = Counter([])

# You can add multiple Counter instances to produce a "merged" Counter
In [15]: for l in lists:
    ...:     counter += Counter(l)

# Find the entries whose value exists more than once
In [16]: duplicates = [k for k,v in counter.items() if v > 1]

# Printing all the duplicate entries across the lists
In [17]: duplicates
Out[17]: [6, 16, 20, 37, 38, 2, 9, 29, 1, 18, 33, 3, 17, 19, 31, 15, 21, 42, 41, 11]

# Length of the duplicate list
In [18]: len(duplicates)
Out[18]: 20

You can read up on Counter in the Python standard library documentation for the collections module.
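
As a possible variation, Counter.update adds counts in place, so the loop does not have to build a temporary Counter for each list. This is a minimal sketch with hard-coded sample lists (made up for illustration) so that the result is reproducible:

```python
from collections import Counter

lists = [[1, 2, 3], [2, 3, 4], [3, 5, 6]]  # hypothetical sample data

counter = Counter()
for l in lists:
    counter.update(l)  # add this list's elements to the running counts

# Values seen more than once across all lists
duplicates = sorted(k for k, v in counter.items() if v > 1)
print(duplicates)       # [2, 3]
print(len(duplicates))  # 2
```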


1

The problem statement is not clear; I assume you want to count duplicates in the concatenation of these 10 arrays. In that case you can take advantage of numpy.unique:

import random
import numpy as np

def my_random_list(l: list):
    return sorted(random.sample(list(set(l)), 6))

collection = [my_random_list(list(range(1, 43))) for _ in range(10)]
conc = np.concatenate(collection)  # all 10 lists joined into one array
items, cnt = np.unique(conc, return_counts=True)  # sorted unique items and their counts
output = items[cnt > 1]  # items that appear more than once


1

collections.Counter and itertools.chain will be helpful.

import random

source = [i for i in range(1, 43)]


def my_random_list():
    return sorted(random.sample(source, 6))


random_lists = [my_random_list() for _ in range(10)]
print(random_lists)

This produces 10 random lists, each of length 6:

>>> [[2, 4, 10, 18, 20, 30], [4, 12, 13, 19, 21, 27], [10, 11, 18, 26, 32, 33], [4, 11, 12, 17, 38, 42], [12, 22, 28, 38, 40, 41], [2, 11, 22, 30, 35, 36], [4, 6, 22, 24, 32, 34], [1, 3, 5, 25, 31, 33], [25, 29, 31, 32, 33, 35], [12, 16, 28, 31, 37, 41]]

Then you can count how often each number occurs:

from collections import Counter
from itertools import chain


counter = Counter(chain(*random_lists))
print(counter)
>>> Counter({4: 4, 12: 4, 11: 3, 32: 3, 33: 3, 22: 3, 31: 3, 2: 2, 10: 2, 18: 2, 30: 2, 38: 2, 28: 2, 41: 2, 35: 2, 25: 2, 20: 1, 13: 1, 19: 1, 21: 1, 27: 1, 26: 1, 17: 1, 42: 1, 40: 1, 36: 1, 6: 1, 24: 1, 34: 1, 1: 1, 3: 1, 5: 1, 29: 1, 16: 1, 37: 1})

Finally, filter the counter with a list comprehension:

results = [k for k, v in counter.items() if v >= 2]
print(results)
>>> [2, 4, 10, 18, 30, 12, 11, 32, 33, 38, 22, 28, 41, 35, 25, 31]
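
The steps above can be folded into one helper. The sketch below is only one way to package them: the name repeated_values and the sample data are hypothetical, and it uses chain.from_iterable, which avoids unpacking the outer list with *.

```python
from collections import Counter
from itertools import chain


def repeated_values(lists):
    """Return {value: count} for values occurring at least twice across lists."""
    counts = Counter(chain.from_iterable(lists))
    return {value: n for value, n in counts.items() if n >= 2}


sample = [[2, 4, 10], [4, 12, 13], [10, 11, 4]]  # hypothetical sample data
repeated = repeated_values(sample)
print(repeated)       # {4: 3, 10: 2}
print(len(repeated))  # 2 distinct values occur more than once
```

Returning the counts rather than only the keys keeps both answers available: len(repeated) is the number of repeated values, and the dictionary values say how often each one appeared.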

