Removing Duplicate Integers in an ArrayList

Question

I see a lot of posts with the same topic (mostly with Strings) but haven't found the answer to my question. How would I remove duplicate integers from an ArrayList?

import java.util.*;

public class ArrayList2 {
    public static ArrayList<Integer> removeAllDuplicates(ArrayList<Integer> list) {
        Collections.sort(list);
        for (int i = 0; i < list.size(); i++) {
            if (list.get(i) == list.get(i + 1)) {
                list.remove(i);
            }
        }
        return list;
    }
}

This is the start of my code. The only problem that has arisen is that if there are 3 integers with the same value, it only removes one of them. If I put in 4, it removes two of them. PLEASE NO HASHING!!!

The ArrayList and the output when I run it:

List: [-13, -13, -6, -3, 0, 1, 1, 1, 5, 7, 9]
Duplicates Removed: [-13, -6, -3, 0, 1, 1, 5, 7, 9]

This is my first time using this website, so please let me know if I'm doing something wrong with formatting/if there's already an answer to my question that I missed.

Would it be more efficient to use a set in this situation? Also, @Micha Wiedenmann, I'll be sure to do that.
– Excel, Commented Nov 8, 2018 at 23:22
@Excel A Set guarantees uniqueness, so yeah, probably, but I'd avoid being tempted to micro-optimise the solution at this point. It would be simpler, easier and more likely to work.
– MadProgrammer, Commented Nov 8, 2018 at 23:23
new HashSet<>(list) or list.stream().distinct().collect(Collectors.toList())
– Kartik, Commented Nov 8, 2018 at 23:24

mrshl · Accepted Answer · 2018-11-08 23:28:26Z

3

The specific reason your removeAllDuplicates function doesn't work is that the loop still advances i after a successful comparison, so the element that slides into index i after the removal is never checked. If you advance i only when list.get(i) and list.get(i + 1) differ, you will get rid of all the duplicates.

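public static ArrayList<Integer> removeAllDuplicates(ArrayList<Integer> list) {
    Collections.sort(list);
    int i = 0;
    while (i < list.size() - 1) {
        // Use equals(): == on two Integer objects compares references, not values.
        if (list.get(i).equals(list.get(i + 1))) {
            list.remove(i); // the next element slides into index i, so don't advance
        } else {
            i++;
        }
    }
    return list;
}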

It's worth noting that the above function is not as fast as it could be. The sort operation costs O(n log n), and each list.remove(i) on an ArrayList shifts every element after it, so a list with many duplicates can push the loop itself toward O(n²).

To avoid this extra time complexity, consider using a HashSet instead of an ArrayList (if it still fits within the constraints of your problem).
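As a rough sketch of the set-based route (the class and method names here are made up for illustration, and List.of assumes Java 9 or later): a plain HashSet makes no promise about iteration order, so this uses LinkedHashSet to keep the original order, and TreeSet as a variant that returns sorted output without any hashing.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper class, not part of the original question.
class SetDedup {
    // Keeps the first occurrence of each value, in the original order.
    static List<Integer> dedupeKeepOrder(List<Integer> list) {
        return new ArrayList<>(new LinkedHashSet<>(list));
    }

    // Returns the distinct values in ascending order.
    // TreeSet is tree-based, so no hashing is involved.
    static List<Integer> dedupeSorted(List<Integer> list) {
        return new ArrayList<>(new TreeSet<>(list));
    }

    public static void main(String[] args) {
        List<Integer> input = List.of(-13, -13, -6, -3, 0, 1, 1, 1, 5, 7, 9);
        System.out.println(dedupeKeepOrder(input)); // [-13, -6, -3, 0, 1, 5, 7, 9]
        System.out.println(dedupeSorted(input));    // [-13, -6, -3, 0, 1, 5, 7, 9]
    }
}
```

Both methods return a new list and leave the input untouched, unlike the in-place version above.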

answered Nov 8, 2018 at 23:28

mrshl


1 Comment

Excel Over a year ago

Checks out! Thank you, I've been stuck on this for a quick minute.

3 revs · Accepted Answer · 2018-11-08 23:42:41Z

2

Other people have already answered the basic issue: the if statement skips over elements because the for-loop keeps incrementing the index position while the list is shrinking.

This is just another possible solution. Personally, I don't like mutating Lists in loops and prefer to use iterators, something like...

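Collections.sort(list);
Iterator<Integer> it = list.iterator();
Integer last = null;
while (it.hasNext()) {
  Integer next = it.next();
  // equals() compares values; == would compare Integer references.
  // next.equals(null) is false, so the first element is always kept.
  if (next.equals(last)) {
    it.remove();
  }
  last = next;
}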

Still, I think some kind of Set would be a simpler and easier solution (and since you'd not have to sort the list, more efficient ;))

edited Nov 8, 2018 at 23:42

community wiki

3 revs
MadProgrammer

4 Comments

mrshl Over a year ago

Whether you use an iterator or not, are you not still mutating the array within a loop?

MadProgrammer Over a year ago

@mettleap Yes, sorry, didn't put that in :P

MadProgrammer Over a year ago

@unmrshl Not sure what you mean. The iterator is a proxy over the List, it just helps remove the possible "mutation" issues around manipulating Lists within arrays ... or am I thinking of some other constraint? In any case, no forgetting to update the index value or weird manipulation logic for the index. It's just "another" way to achieve the result, which for me, is a little cleaner. (ps - I'm thinking of for-each loops :P)

mettleap Over a year ago

You're welcome :) ... without a data structure like a Set, this is pretty efficient

rgettman · Accepted Answer · 2018-11-08 23:26:32Z

0

This assumes that you cannot use a set ("Please No hashing!!!") for a reason such as homework.

Look what happens when you remove a duplicate. Let's say that the dupe you're removing is the 1st of 3 consecutive equal numbers.

                v
-13, -6, -3, 0, 1, 1, 1, 5, 7, 9

Here i is 4 to refer to the first of the 3 1 values. When you remove it, all subsequent elements get shifted down, so that the second 1 value now takes the place of the first 1 value at index i = 4. Then the current iteration ends, another one begins, and i is now 5.

                   v
-13, -6, -3, 0, 1, 1, 5, 7, 9

But now i is referring to the second of the two 1 values left, and the next element 5 isn't equal, so no dupe is found.

Every time you find a duplicate, you must decrease i by 1 so that you can stay on the first element that is a duplicate, to catch 3 or more duplicates in a row.

                v
-13, -6, -3, 0, 1, 1, 5, 7, 9

Now consecutive elements still match, and another 1 value will get removed.
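Here is a minimal sketch of that idea, not a definitive version. The class name is made up, and two details go beyond the explanation above: the loop stops at size() - 1 so that get(i + 1) stays in bounds, and the values are compared with equals() rather than ==.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, not part of the original question.
class DecrementDedup {
    static List<Integer> removeAllDuplicates(List<Integer> list) {
        Collections.sort(list);
        // Stop at size() - 1 so that get(i + 1) is always in bounds.
        for (int i = 0; i < list.size() - 1; i++) {
            // equals() compares values; == would compare Integer references.
            if (list.get(i).equals(list.get(i + 1))) {
                list.remove(i); // int argument, so this removes by index
                i--;            // cancels the loop's i++ so index i is checked again
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list =
                new ArrayList<>(List.of(-13, -13, -6, -3, 0, 1, 1, 1, 5, 7, 9));
        System.out.println(removeAllDuplicates(list)); // [-13, -6, -3, 0, 1, 5, 7, 9]
    }
}
```

The list is modified in place, so it has to be a mutable list such as an ArrayList.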

answered Nov 8, 2018 at 23:26

rgettman


3 Comments

Excel Over a year ago

Would a simple "i--;" in the if statement fix the code? I'm assuming this is what you're saying.

MadProgrammer Over a year ago

@Excel You "really" want to avoid messing with the iterator of a for-loop

Makoto Over a year ago

@Excel: It would, but there'd be a bit more work to do...it's also a faux-pas to mess with the iterator of a loop as MadProgrammer alluded to.

Bhagyesh · Accepted Answer · 2018-11-08 23:30:45Z

0

You'd want a double for loop, since you don't want to check only the next index but all indexes. Also remember that when you remove a value, you need to check that index again, as the removed value could have been replaced by another duplicate.
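A rough sketch of what that could look like when the list is not sorted first (the class name is invented for illustration). It keeps the first occurrence of each value in its original position, at the cost of O(n²) comparisons:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original question.
class NestedLoopDedup {
    static List<Integer> removeAllDuplicates(List<Integer> list) {
        for (int i = 0; i < list.size(); i++) {
            // Compare element i against every later element.
            for (int j = i + 1; j < list.size(); ) {
                if (list.get(i).equals(list.get(j))) {
                    // The next element slides into index j,
                    // so j is not advanced and that index is checked again.
                    list.remove(j);
                } else {
                    j++;
                }
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(List.of(5, 1, 5, 1, 1, 9));
        System.out.println(removeAllDuplicates(list)); // [5, 1, 9]
    }
}
```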

answered Nov 8, 2018 at 23:30

Bhagyesh


1 Comment

mrshl Over a year ago

This would be true if he were not sorting the array. Since it is sorted, the only potential duplicates will be at an index greater than i, which can be handled by simply not iterating after each removal
