Deleting duplicates in an array in C++

Question

How do I delete duplicate elements in an array without using any other type of data structure?

I'm just having a hard time shifting elements. Please help!!

For example if I had this array: string arr[] = {"helo", "helo", "dog"}

how do I get it to be {"helo", "dog"}?

Is the array sorted or unsorted? Do you need to maintain any sorting? Do you need to maintain stability? The simplest way to remove an entry from an array but keep it tightly packed is: arr[replace_ind] = arr[--arr_size];
– jschultz410, Commented Feb 22, 2015 at 1:24

4pie0 · Accepted Answer · 2015-02-22 18:35:55Z

In C++ you can use the std::unique function template, declared in <algorithm>. Applied to a sorted range, it moves the unique elements to the front and returns an iterator to the end of that unique sequence; it does not shrink the container.

#include <iostream>
#include <algorithm>
#include <string>
#include <vector>

int main() 
{
    std::vector<std::string> v{ "helo", "helo", "dog" };
    std::sort( v.begin(), v.end() );
    std::vector<std::string>::iterator last;

    last = std::unique( v.begin(), v.end()); // "dog", "helo", "helo" 
                                            //                  ^
    for ( std::vector<std::string>::iterator it = v.begin(); it != last; ++it) {
        std::cout << *it << " ";
    }
    std::cout << "\n"; // output: dog helo
}

This function template also works when given pointers into an ordinary array as the begin and end iterators (pointers are iterators; "iterator" is an abstract notion). Be careful, however, to understand what std::unique() does in detail: as you can see, the elements have to be sorted first to get the result you expect, and std::unique() does not change the size of the container.
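As a rough sketch of the pointer case, assuming the question's built-in array of std::string: the helper name dedupe_sorted_array is made up for this example and is not part of the standard library. It sorts the array, runs std::unique over the pointer range, and returns the new logical length, since the array itself cannot shrink.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the standard library): sorts a built-in
// array in place, collapses duplicates with std::unique, and returns how
// many unique elements now sit at the front of the array.
std::size_t dedupe_sorted_array(std::string* arr, std::size_t n)
{
    assert(arr != nullptr || n == 0);
    std::sort(arr, arr + n);                        // pointers act as iterators
    std::string* last = std::unique(arr, arr + n);  // end of the unique prefix
    return static_cast<std::size_t>(last - arr);    // new logical length
}
```

For std::string arr[] = {"helo", "helo", "dog"}, calling dedupe_sorted_array(arr, 3) returns 2 and leaves "dog" and "helo" in the first two slots. The third slot still exists but holds an unspecified value, so the caller should only read the first 2 elements.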

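Nothing prevents you from writing your own unique as well (put it in your own namespace or give it another name, since an unqualified call with standard iterators can otherwise be ambiguous with std::unique):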

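#include <utility> // std::move

template<class ForwardIt>
ForwardIt unique(ForwardIt first, ForwardIt last)
{
    if (first == last)
        return last;

    ForwardIt result = first; // last element kept so far
    while (++first != last) {
        // keep *first only if it differs from the last kept element;
        // the ++result != first test skips a self-move assignment
        if (!(*result == *first) && ++result != first) {
            *result = std::move(*first);
        }
    }
    return ++result;
}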

If you want to remove the leftover elements that follow the unique ones, you can erase them:

std::sort( v.begin(), v.end() );
v.erase( std::unique( v.begin(), v.end() ), v.end() );
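A minimal sketch of the sort, unique, erase sequence wrapped in a function, so the change in size is visible. The name sorted_unique is hypothetical, and taking the vector by value is just one possible design choice that leaves the caller's data untouched.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical wrapper around the sort + unique + erase idiom.
// Works on a copy and returns the sorted, duplicate-free result.
std::vector<std::string> sorted_unique(std::vector<std::string> v)
{
    std::sort(v.begin(), v.end());
    // std::unique compacts the range; erase() then drops the leftover tail
    v.erase(std::unique(v.begin(), v.end()), v.end());
    // sanity check: no two neighbouring elements are equal any more
    assert(std::adjacent_find(v.begin(), v.end()) == v.end());
    return v;
}
```

Called with { "helo", "helo", "dog" }, this returns a vector of size 2 holding "dog" and "helo", whereas std::unique on its own would have left the size at 3.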

yzt · Accepted Answer · 2015-02-22 00:24:23Z

Here's how I would do it for an unsorted array that I didn't want to reorder:

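// returns how many elements were *removed*
// (needs #include <utility> for std::move)
template <typename T>
int RemoveDups (T a [], int n)
{
    int shift = 0;
    for (int i = 1; i < n; ++i)
    {
        // search the kept elements, which end at index i - 1 - shift
        int j = i - 1 - shift;
        for ( ; j >= 0; --j)
            if (a[i] == a[j])
                break;
        if (j < 0)  // not a duplicate
        {
            if (shift != 0)  // with shift == 0 it is already in place; skip the self-move
                a[i - shift] = std::move(a[i]);
        }
        else
            shift += 1;
    }
    return shift;
}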

The way I accept an array as input is not the best way; use a std::array or a pair of iterator-like objects instead. But I wanted the simplest answer.

Here's a description of what's going on:

We maintain a variable shift, which always tracks how many slots each element needs to be moved back, or equivalently, how many elements have already been removed before the current one.

Starting from the second element (because obviously the first element is not a duplicate of anything), we compare the ith element with the elements before it. If this element is not equal to any of them, then it is a "good" element that needs to be kept, so we move it shift spots back. Otherwise, we leave it where it is to be overwritten later (or not; we don't care) and just increment shift (because now one more element has been removed and needs to be stepped over).

This comparison of the ith element with the previous ones starts at index i - 1 - shift and goes backwards, because the shift slots just before i have been vacated: any "good" elements among them have already been moved back, so they don't need to be checked again.

That's it. The time complexity of this algorithm is O(n^2) and it is stable.
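As a sketch of the iterator-pair interface mentioned above, the same stable O(n^2) idea can be written to return the new logical end, in the style of std::unique. The name stable_unique_unsorted is made up for this example, and unlike RemoveDups it scans the kept prefix forwards rather than backwards.

```cpp
#include <utility>

// Hypothetical name. Keeps the first occurrence of every value, preserves
// the relative order of the kept elements, and returns an iterator one
// past the last kept element.
template <typename ForwardIt>
ForwardIt stable_unique_unsorted(ForwardIt first, ForwardIt last)
{
    ForwardIt out = first;                  // end of the kept prefix
    for (ForwardIt it = first; it != last; ++it) {
        bool seen = false;
        for (ForwardIt k = first; k != out; ++k) {
            if (*k == *it) { seen = true; break; }
        }
        if (!seen) {
            if (out != it)                  // avoid a self-move assignment
                *out = std::move(*it);
            ++out;
        }
    }
    return out;
}
```

With a built-in array, the call would look like stable_unique_unsorted(arr, arr + n), and the number of kept elements is the returned pointer minus arr. With a std::vector, the returned iterator can be passed straight to erase().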

If you know that the elements are already sorted, you only need to compare each element with the one exactly shift + 1 spots before it (index i - 1 - shift, the last element kept), and be done in O(n).
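A possible sketch of that sorted-input shortcut, keeping the interface of RemoveDups. The name RemoveDupsSorted is hypothetical, and the function assumes the caller has already sorted the array.

```cpp
#include <cassert>
#include <utility>

// Sketch of the sorted-input shortcut; the name is hypothetical.
// Assumes a[0..n) is already sorted, so equal values are adjacent.
// Returns how many elements were removed, like RemoveDups.
template <typename T>
int RemoveDupsSorted(T a[], int n)
{
    assert(n >= 0);
    int shift = 0;
    for (int i = 1; i < n; ++i)
    {
        // the last kept element sits at index i - 1 - shift
        if (a[i] == a[i - 1 - shift])
            shift += 1;                       // duplicate: step over it
        else if (shift != 0)
            a[i - shift] = std::move(a[i]);   // keep: close the gap
    }
    return shift;
}
```

Each element is compared exactly once, so the pass is linear; the first n minus the returned count elements are the unique ones.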

If you don't mind reordering your data, you can first sort them in O(n*log(n)) and then use the O(n) method for a total of O(n*log(n)).
