Find duplicates in array when there is more than one duplicated element

Question

How can I find duplicates in an array when there is more than one duplicated element?

When the array has only one duplicated element (for example: 1, 2, 3, 4, 4, 4, 5, 6, 7), it is very easy:

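int duplicate(int* a, int s)
{
    // XOR all array elements together.
    int x = a[0];
    for(int i = 1; i < s; ++i)
    {
        x = x ^ a[i];
    }
    // XOR with every integer from 0 up to, but not including, the last element a[s - 1].
    for(int i = 0; i < a[s - 1]; ++i)
    {
        x = x ^ i;
    }
    return x;
}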

But if the input array contains more than one duplicated element (for example: 1, 2, 2, 2, 3, 4, 4, 4, 5, 6, 7), the above won't work. How can we solve this problem in O(n) time?

What on earth are you doing here? What are all these XOR operations for? And why the weird syntax (*(a+s-1) instead of a[s-1])?
– r3mainer, Commented Dec 9, 2013 at 8:52
If you are allowed to use extra space, you can build a hash map from value to count and then check which counts are > 1.
– Abhishek Bansal, Commented Dec 9, 2013 at 8:53
What is the function duplicate() supposed to return? Currently you're returning this value ({XOR:a[i]} XOR 1..a[s-1]), which doesn't make sense to me.
– Khaled.K, Commented Dec 9, 2013 at 9:13
Khaled A Khunaifer: it is certainly supposed to return an element that is a duplicate, as written in the code.
– user1886376, Commented Dec 9, 2013 at 9:30

MrSmith42 · Accepted Answer · 2013-12-09 08:56:39Z

If space is no concern or the maximum value is quite low, you can simply use a kind of bit array and mark every number that has already occurred by setting the bit at the position of that number.

It's a kind of HashSet with a trivial (identity) hash function. Testing and setting a bit each cost O(1) time.
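
A minimal C++ sketch of this bit-array idea, assuming every element is a non-negative integer no larger than a known max_value. The function name find_duplicates_bitset and the max_value parameter are hypothetical, chosen only for illustration. A second bit array keeps each duplicate from being reported more than once.

```cpp
#include <cstddef>
#include <vector>

// Returns each value that occurs more than once, listed once.
// Assumption: 0 <= a[i] <= max_value for every element.
std::vector<int> find_duplicates_bitset(const int* a, std::size_t n, int max_value) {
    std::vector<bool> seen(static_cast<std::size_t>(max_value) + 1, false);
    std::vector<bool> reported(seen.size(), false);
    std::vector<int> duplicates;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t v = static_cast<std::size_t>(a[i]);
        if (seen[v] && !reported[v]) {
            duplicates.push_back(a[i]);  // second occurrence of this value
            reported[v] = true;
        }
        seen[v] = true;  // mark the value as already encountered
    }
    return duplicates;
}
```

Each element is visited once and every bit test or set is O(1), so the whole pass is O(n) time with O(max_value) extra space.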

Jean · Accepted Answer · 2013-12-09 09:59:55Z

Using a set is one possible generic solution. Example in C++:

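#include <unordered_set>

template <typename T>
void filter_duplicates(T* arr, int length) {
    std::unordered_set<T> set;  // values encountered so far
    for (int i = 0; i < length; ++i) {
        if (set.count(arr[i]) > 0) {
            // arr[i] was seen earlier, so it is a duplicate
        }
        set.insert(arr[i]);  // no effect if the value is already present
    }
    // the set now contains each distinct item exactly once
}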

As unordered_set is implemented as a hash table, insertion and lookup have amortized constant complexity. As a set can only contain unique keys, this effectively de-duplicates the items. Finally, we could convert the set back to an array. We could also use a map to count the occurrences.
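
The map-based variant could look roughly like the sketch below. The name find_duplicates_by_count is hypothetical, and T is assumed to be usable as a std::unordered_map key (that is, std::hash<T> and operator== exist for it).

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Returns each value that occurs more than once, listed once,
// in the order in which its second occurrence is met.
template <typename T>
std::vector<T> find_duplicates_by_count(const T* arr, std::size_t length) {
    std::unordered_map<T, std::size_t> counts;
    std::vector<T> duplicates;
    for (std::size_t i = 0; i < length; ++i) {
        // operator[] inserts a zero count for a key that is not yet present.
        if (++counts[arr[i]] == 2) {
            duplicates.push_back(arr[i]);  // report on the second occurrence only
        }
    }
    return duplicates;
}
```

Reporting only when the count reaches exactly 2 means a value that appears three or more times is still listed once.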

If the array elements are integers and the maximum possible value is known and fairly low, then the set can be replaced by a simple array, either of booleans or, if we want to count the number of occurrences, of integers.
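
For the counting case, a plain array indexed by value might be used as in this sketch. It assumes non-negative integers bounded by a known max_value; count_occurrences and max_value are illustrative names, not part of the answer above.

```cpp
#include <cstddef>
#include <vector>

// counts[v] is the number of times v occurs in a.
// Assumption: 0 <= a[i] <= max_value for every element.
std::vector<std::size_t> count_occurrences(const int* a, std::size_t n, int max_value) {
    std::vector<std::size_t> counts(static_cast<std::size_t>(max_value) + 1, 0);
    for (std::size_t i = 0; i < n; ++i) {
        ++counts[static_cast<std::size_t>(a[i])];
    }
    return counts;
}
```

Any value v with counts[v] > 1 is a duplicate, and the array also tells how many times it was repeated.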
