Remove duplicates from a list in Python

Question

I have a dynamic list:

[{'dashboard': 'AG', 'end_date': '2021-06-17 13:13:43', 'location': 'EC & pH Reading', 'zone_name': 'Zone 1 Left'}, 

{'dashboard': 'AG', 'end_date': '2021-06-17 12:40:06', 'location': 'Harvest', 'zone_name': 'Zone 2 Left'}, 

{'dashboard': 'AG', 'end_date': '2021-06-16 15:52:52', 'location': 'Harvest', 'zone_name': 'Zone 1 Left' }, 

{'dashboard': 'AG', 'end_date': '2021-06-16 15:45:51', 'location': 'Harvest', 'zone_name': 'Zone 1 Left'}]

I want to remove duplicates based on zone_name and location, keeping only the latest entry for each combination. For example, 'Zone 1 Left' appears 3 times in zone_name. I have already sorted the list by end_date, so the latest entry comes first.

This is what I have tried:

final_zone = []
res_list = []
for i in sortedArray:
     if i["location"] not in final_zone:
          sch.append(i)
          final_zone.append(i["location"])

What change do I need to make to remove the duplicates based on zone_name and location?

That is, for Zone 1 Left there are 3 values, and I need the latest one.

The latest one. I have sorted the list by end_date.

– karthik, Commented Jun 17, 2021 at 8:44

user2390182 · Accepted Answer · 2021-06-17 09:01:05Z

1

For a general approach with an unsorted list:

from itertools import groupby
from operator import itemgetter

# sorting and grouping functions
f_sort = itemgetter("location", "zone_name", "end_date")  # sort by descending
f_group = itemgetter("location", "zone_name")  # group sorted by

result = [
    next(g) for _, g in  # only take latest of each group
    groupby(sorted(array, key=f_sort, reverse=True), key=f_group)
]

The utilities used here (sorted, itertools.groupby, and operator.itemgetter) are all really handy in a lot of use cases; see their entries in the Python documentation.
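For comparison, the same "latest per group" result can also be obtained in a single pass without sorting. This is only a sketch, not part of the answer above: the function name latest_per_key and its parameters are made up for illustration, and it assumes end_date always uses the 'YYYY-MM-DD HH:MM:SS' format from the question, so that plain string comparison orders the timestamps chronologically.

```python
def latest_per_key(rows, key_fields=("zone_name", "location"), date_field="end_date"):
    """Keep, for each combination of key_fields, the row with the greatest date_field."""
    latest = {}
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        # Assumption: timestamps look like 'YYYY-MM-DD HH:MM:SS', so comparing
        # the strings directly is the same as comparing the dates.
        if key not in latest or row[date_field] > latest[key][date_field]:
            latest[key] = row
    return list(latest.values())


# Sample data from the question, deliberately left unsorted.
rows = [
    {'dashboard': 'AG', 'end_date': '2021-06-16 15:45:51', 'location': 'Harvest', 'zone_name': 'Zone 1 Left'},
    {'dashboard': 'AG', 'end_date': '2021-06-17 13:13:43', 'location': 'EC & pH Reading', 'zone_name': 'Zone 1 Left'},
    {'dashboard': 'AG', 'end_date': '2021-06-16 15:52:52', 'location': 'Harvest', 'zone_name': 'Zone 1 Left'},
    {'dashboard': 'AG', 'end_date': '2021-06-17 12:40:06', 'location': 'Harvest', 'zone_name': 'Zone 2 Left'},
]

result = latest_per_key(rows)
```

The result contains one dictionary per (zone_name, location) pair, in the order each pair first appears in the input.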

edited Jun 17, 2021 at 9:01

answered Jun 17, 2021 at 8:55


Stefano Fiorucci - anakin87 · Accepted Answer · 2021-06-17 08:50:45Z

0

clean_list = []

# lst must already be sorted with the latest entry first
for elem in lst:
    # check whether an element with the same zone name and location
    # is already present in the clean list
    already_present = any(
        el['zone_name'] == elem['zone_name'] and el['location'] == elem['location']
        for el in clean_list
    )
    if not already_present:
        clean_list.append(elem)

OUTPUT:

[{'dashboard': 'AG',
  'end_date': '2021-06-17 13:13:43',
  'location': 'EC & pH Reading',
  'zone_name': 'Zone 1 Left'},
 {'dashboard': 'AG',
  'end_date': '2021-06-17 12:40:06',
  'location': 'Harvest',
  'zone_name': 'Zone 2 Left'},
 {'dashboard': 'AG',
  'end_date': '2021-06-16 15:52:52',
  'location': 'Harvest',
  'zone_name': 'Zone 1 Left'}]

answered Jun 17, 2021 at 8:50


4 Comments

karthik Over a year ago

You have saved my day. Thanks

Stefano Fiorucci - anakin87 Over a year ago

If my answer is useful, please upvote and/or accept it.

karthik Over a year ago

I need 15 reputation to upvote. I don't have that, so I'm not able to.

Stefano Fiorucci - anakin87 Over a year ago

You can accept it (instructions: meta.stackexchange.com/a/5235/645001)

ThePyGuy · Accepted Answer · 2021-06-17 08:49:07Z

0

Create a list named result. For each dictionary in the data list, check whether an item with the same zone_name is already in result; if so, skip it, otherwise append it. Note that this compares zone_name only, not location.

result = []
for item in data:
    if item['zone_name'] in (x['zone_name'] for x in result):
        continue
    result.append(item)

OUTPUT:

[{'dashboard': 'AG',
  'end_date': '2021-06-17 13:13:43',
  'location': 'EC & pH Reading',
  'zone_name': 'Zone 1 Left'},
 {'dashboard': 'AG',
  'end_date': '2021-06-17 12:40:06',
  'location': 'Harvest',
  'zone_name': 'Zone 2 Left'}]
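Since the question asks for duplicates to be judged on both zone_name and location, a variant that tracks the pair in a set might look like the sketch below. The helper name keep_first_per_key is hypothetical, and the sketch assumes the input is already sorted newest first, as the question states, so that the first occurrence of each pair is the latest one.

```python
def keep_first_per_key(rows, key_fields=("zone_name", "location")):
    """Keep only the first row seen for each combination of key_fields."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key in seen:
            # A newer row with the same key was already kept.
            continue
        seen.add(key)
        result.append(row)
    return result


# Data from the question, already sorted newest first.
data = [
    {'dashboard': 'AG', 'end_date': '2021-06-17 13:13:43', 'location': 'EC & pH Reading', 'zone_name': 'Zone 1 Left'},
    {'dashboard': 'AG', 'end_date': '2021-06-17 12:40:06', 'location': 'Harvest', 'zone_name': 'Zone 2 Left'},
    {'dashboard': 'AG', 'end_date': '2021-06-16 15:52:52', 'location': 'Harvest', 'zone_name': 'Zone 1 Left'},
    {'dashboard': 'AG', 'end_date': '2021-06-16 15:45:51', 'location': 'Harvest', 'zone_name': 'Zone 1 Left'},
]

result = keep_first_per_key(data)
```

Using a set for the seen keys keeps each membership check cheap, which may matter if the list grows large.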

answered Jun 17, 2021 at 8:49


Martin Wettstein · Accepted Answer · 2021-06-17 08:49:18Z

0

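You can just loop through the list and remember, for each key, the index you want to keep.

keepers = {}
for i, item in enumerate(sorted_array):
    # A later entry overwrites an earlier one with the same key, so the last
    # occurrence wins; this keeps the latest entry if the list is sorted oldest first.
    # For a newest-first list, use keepers.setdefault(key, i) to keep the first one.
    key = (item['zone_name'], item['location'])
    keepers[key] = i

final_array = []
for i in keepers.values():
    final_array.append(sorted_array[i])

As a bonus, you get all (zone_name, location) pairs in keepers.keys().

But your approach might also work. Change sch.append(i) to res_list.append(i) and test the (zone_name, location) pair instead of location alone. Your loop keeps the first occurrence of each key, which is already the latest entry when the list is sorted newest first; if the list is sorted oldest first instead, reverse the iteration (for i in sortedArray[::-1]) so the last one gets kept.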

answered Jun 17, 2021 at 8:49


Tom McLean · Accepted Answer · 2021-06-17 08:54:42Z

The other answers work, but I want to add a solution using pandas.

You can create a DataFrame from your list of dictionaries:

import pandas as pd
d = [
    {'dashboard': 'AG', 'end_date': '2021-06-17 13:13:43', 'location': 'EC & pH Reading', 'zone_name': 'Zone 1 Left'},
    {'dashboard': 'AG', 'end_date': '2021-06-17 12:40:06', 'location': 'Harvest', 'zone_name': 'Zone 2 Left'},
    {'dashboard': 'AG', 'end_date': '2021-06-16 15:52:52', 'location': 'Harvest', 'zone_name': 'Zone 1 Left'},
    {'dashboard': 'AG', 'end_date': '2021-06-16 15:45:51', 'location': 'Harvest', 'zone_name': 'Zone 1 Left'},
]
df = pd.DataFrame(d)

This is what df looks like:

  dashboard             end_date         location    zone_name
0        AG  2021-06-17 13:13:43  EC & pH Reading  Zone 1 Left
1        AG  2021-06-17 12:40:06          Harvest  Zone 2 Left
2        AG  2021-06-16 15:52:52          Harvest  Zone 1 Left
3        AG  2021-06-16 15:45:51          Harvest  Zone 1 Left

Sort of like a table in Excel.

Now with one line, you can do exactly what you want:

df.sort_values("end_date").drop_duplicates(["location", "zone_name"], keep="last")

output:

  dashboard             end_date         location    zone_name
2        AG  2021-06-16 15:52:52          Harvest  Zone 1 Left
1        AG  2021-06-17 12:40:06          Harvest  Zone 2 Left
0        AG  2021-06-17 13:13:43  EC & pH Reading  Zone 1 Left
