1

I want to select rows in Django which have multiple rows with duplicated values.

For example I have a model which looks like this

class User:
    first_name = models.CharField(...)
    last_name = models.CharField(...)
    ...

and a database looking like this

first_name | last_name | ...
---------- | --------- | ...
Paul       | Adams     | ...
John       | Smith     | ...
Tom        | Mueller   | ...
John       | Smith     | ...
John       | Adams     | ...
Paul       | Adams     | ...

and I want to get the first_name and last_name values which exist multiple times in the database. The other values must not be equal of the rows. In this example I want to get "John Smith" and "Paul Adams".

5
  • Is this an answer to your question? Commented Nov 26, 2020 at 10:11
  • @Brambor No it isn't. I found this question already, but it only works if the value of one field is duplicated, not of multiple fields. Commented Nov 26, 2020 at 10:24
  • Can't you just use the answer twice to filter by two columns? Commented Nov 26, 2020 at 10:26
  • I tried it and it didn't work, maybe I did something wrong. I can try again later. Commented Nov 26, 2020 at 10:28
  • try doing something like .values("first_field", "second_field") Commented Nov 26, 2020 at 10:29

2 Answers 2

3

First of all, you need to concat fields.

from django.db.models import Count, Value
from django.db.models.functions import Concat

queryset = User.objects.annotate(full_name=Concat('first_name', Value(' '), 'last_name', output_field=CharField())

then you need to group and calculate the count for each unique full_name

queryset = queryset.values('full_name').annotate(full_name_count=Count('full_name'))

and just filter those with count bigger than one.

queryset = queryset.filter(full_name_count__gt=1)
Sign up to request clarification or add additional context in comments.

Comments

2

You can do this in views.py or in your django shell using py manage.py shell

from django.db.models import Count
#import your User model 

users_output = User.objects.values('first_name','last_name').annotate(first_name_count=Count('first_name'),last_name_count = Count('last_name')).filter(first_name_count__gt=1,last_name_count__gt=1)

the above query will return a query set something like this

<QuerySet
 [{'first_name': 'John', 'last_name': 'Smith', 'first_name_count': 2, 'last_name_count': 2},
 {'first_name': 'Paul', 'last_name': 'Adams', 'first_name_count': 2, 'last_name_count': 2}]
>

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.