How do I delete duplicate elements in an array without using any other type of data structure?

I'm just having a hard time shifting elements. Please help!!

For example if I had this array: string arr[] = {"helo", "helo", "dog"}

how do I get it to be {"helo", "dog"}?

  • Is the array sorted or unsorted? Do you need to maintain any sorting? Do you need to maintain stability? The simplest way to remove an entry from an array but keep it tightly packed is: arr[replace_ind] = arr[--arr_size]; Commented Feb 22, 2015 at 1:24

2 Answers

In C++ you can use the unique function template defined in namespace std (header <algorithm>). Applied to a sorted range, it moves the unique elements to the front and returns an iterator to the end of the unique sequence.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

int main() 
{
    std::vector<std::string> v{ "helo", "helo", "dog" };
    std::sort( v.begin(), v.end() ); // "dog", "helo", "helo"
    std::vector<std::string>::iterator last;

    // std::unique keeps the first of each run of equal adjacent elements
    // and returns an iterator one past the last kept element.
    last = std::unique( v.begin(), v.end() ); // "dog", "helo", <unspecified>
                                              //                 ^ last
    for ( std::vector<std::string>::iterator it = v.begin(); it != last; ++it) {
        std::cout << *it << " ";
    }
    std::cout << "\n"; // output: dog helo
}

This function template also works when given pointers into an ordinary array as the begin and end iterators (pointers are iterators; "iterator" is an abstract notion). Be careful, however, to understand exactly what std::unique() does: it only removes consecutive duplicates, so the elements have to be sorted first to get what you expect, and it does not change the size of the container.
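For the plain array in the question, the pointer-based use could look like the sketch below. The helper name sort_and_unique is hypothetical, and it assumes that reordering the array is acceptable.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Hypothetical helper: sorts a plain array, removes adjacent duplicates,
// and returns how many unique elements now sit at the front of the array.
template <typename T, std::size_t N>
std::size_t sort_and_unique(T (&arr)[N])
{
    std::sort(std::begin(arr), std::end(arr));
    // For a built-in array, std::begin/std::end yield plain pointers.
    T* new_end = std::unique(std::begin(arr), std::end(arr));
    return static_cast<std::size_t>(new_end - std::begin(arr));
}
```

The array itself keeps its original length; only the first sort_and_unique(arr) elements are meaningful afterwards.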

Nothing prevents you from writing your own unique, either:

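#include <utility> // std::move

template<class ForwardIt>
ForwardIt unique(ForwardIt first, ForwardIt last)
{
    if (first == last)
        return last;

    ForwardIt result = first; // last element kept so far
    while (++first != last) {
        // Keep *first only if it differs from the last kept element;
        // the second test avoids a self-move-assignment.
        if (!(*result == *first) && ++result != first) {
            *result = std::move(*first);
        }
    }
    return ++result;
}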

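If you want to remove the leftover elements, those that follow the unique ones, you can erase them. Qualify the call as std::unique: an unqualified call can be ambiguous if you have also defined your own unique as above.

std::sort( v.begin(), v.end() );
v.erase( std::unique( v.begin(), v.end() ), v.end() );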

Here's how I would do it for an unsorted array that I didn't want to reorder:

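#include <utility> // std::move

// returns how many elements were *removed*
template <typename T>
int RemoveDups (T a [], int n)
{
    int shift = 0;
    for (int i = 1; i < n; ++i)
    {
        // search the elements kept so far, a[0 .. i - 1 - shift], for a match
        int j = i - 1 - shift;
        for ( ; j >= 0; --j)
            if (a[i] == a[j])
                break;
        if (j < 0)  // not a duplicate
        {
            if (shift > 0)  // avoid a self-move-assignment when nothing was removed yet
                a[i - shift] = std::move(a[i]);
        }
        else
            shift += 1;
    }
    return shift;
}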

The way I accept the array as input is not the best way; use a std::array or a pair of iterator-like objects instead. But I wanted the simplest answer.

Here's a description of what's going on:

We maintain a variable shift, which tracks how many slots each kept element needs to be moved back or, equivalently, how many elements have already been removed before the current one.

Starting from the second element (because the first element obviously cannot be a duplicate of anything), we compare the ith element with all the kept elements before it. If it is not equal to any of them, it is a "good" element and needs to be kept, so we move it shift spots back. Otherwise, we leave it where it is to be overwritten later (or not; we don't care) and just increment shift (because one more element has now been removed and needs to be stepped over).

This comparison of the ith element with the previous ones starts at index i - 1 - shift and goes backwards, because the shift slots immediately before i hold only removed duplicates or stale copies (all the "good" elements among them have already been moved back).

That's it. The time complexity of this algorithm is O(n^2) and it is stable.

If you know that the elements are already sorted, you only need to compare each element with the last kept one, at index i - 1 - shift (that is, exactly shift + 1 spots before it), and be done in O(n).
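A minimal sketch of that sorted-input shortcut, in the same style as RemoveDups. The name RemoveDupsSorted is hypothetical, and the code assumes the input is already sorted so that equal elements are adjacent.

```cpp
#include <utility>

// Hypothetical sorted-input variant; returns how many elements were removed.
// Assumes a[0 .. n-1] is already sorted, so equal elements are adjacent.
template <typename T>
int RemoveDupsSorted(T a[], int n)
{
    int shift = 0;
    for (int i = 1; i < n; ++i)
    {
        // a[i - 1 - shift] is the last element kept so far
        if (a[i] == a[i - 1 - shift])
            shift += 1;                      // duplicate: step over it
        else if (shift > 0)
            a[i - shift] = std::move(a[i]);  // close the gap
    }
    return shift;
}
```

Each element is compared exactly once, which is where the O(n) bound comes from.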

If you don't mind reordering your data, you can first sort them in O(n*log(n)) and then use the O(n) method for a total of O(n*log(n)).
