Removing duplicates from a Array [duplicate]

Question

enter image description here

I am dealing with the following problem from my data structure book. I have come up with the solution as suggested by this text . I basically found all the duplicates and labelled them some arbitrary number like 666 and then removed them from the array.

My question to everyone is - is my solution is exactly as the text suggested? - also what is a more effective method to solve this problem?

Here is the complete code (LOOK at the nodups method to see my solution)

public class HighArray {

    private long[] a;
    private int nElems;

    public HighArray(int max) {

        a = new long[max];
        nElems = 0;
    }

    public boolean find(long searchKey) {
        int j;
        for (j = 0; j < nElems; j++)
            if (a[j] == searchKey)
                break;

        if (j == nElems) {
            return false;
        }

        else {
            return true;
        }
    }

    public void insert(long value) {

        a[nElems] = value;
        nElems++;
    }

    public void noDups() {
        int i = 0;
        long compareKey;

        while (i < nElems) {

            compareKey = a[i];

            for (int j = 0; j < nElems; j++) {
                if (j != i && a[j] != 666) {
                    if (a[j] == compareKey) {
                        a[j] = 666;
                    }
                }
                j++;
            }

            i++;
        }

        for (int k = 0; k < nElems; k++) {
            if (a[k] == 666) {
                delete(a[k]);
            }
        }

    }

    public boolean delete(long value) {

        int j;
        for (j = 0; j < nElems; j++)
            if (a[j] == value)
                break;

        if (j == nElems) {
            return false;
        }

        else {
            for (int k = j; k < nElems - 1; k++)
                a[k] = a[k + 1];
            nElems--;
            return true;
        }
    }

    public long removeMax() {

        if (nElems != 0) {
            long maxValue = a[0];

            for (int i = 0; i < nElems; i++) {
                if (a[i] > maxValue)
                    maxValue = a[i];
            }

            delete(maxValue);
            return maxValue;
        }

        else {
            return -1;
        }
    }

    public void display() {

        for (int i = 0; i < nElems; i++) {
            System.out.println(a[i]);
        }
    }

}

The following class has main method.

public class HighArrayApp {

    public static void main(String[] args) {

        HighArray arr = new HighArray(100);

        arr.insert(2);
        arr.insert(2);
        arr.insert(3);
        arr.insert(4);
        arr.insert(4);
        arr.insert(5);

        arr.display();

        arr.noDups();

        System.out.println("-------------------------");

        arr.display();

    }

}

I highly appreciate any suggestions and i am open to all kinds of approaches that attempt a more effective algorithm for this problem.

You can add elements of long[] to a Set (HashSet) and convert the HashSet to array to eliminate duplicates — suman j
– suman j, Commented Jul 21, 2014 at 22:32
Does it work? Make sure you test some cases where there are three or more duplicates right next to each other in the array. I suspect you will run into problems. — ajb
– ajb, Commented Jul 21, 2014 at 22:37
@user2733436 would you like to see a lambda expression of what you try to achieve? — Kick Buttowski
– Kick Buttowski, Commented Jul 21, 2014 at 22:38
@jack while that would work it defeats the point of learning algorithms to remove duplicates as the Set Does the work. — twain249
– twain249, Commented Jul 21, 2014 at 22:39
This algorithm is not correct if the array happens to contain 666 already. To fix this, test the compareKey for being 666 and set a flag to true if you find it, and then loop for (int j = i+1; j < nElems; j++). Then, when you remove the "666" entries, skip the first if the flag is true. Of course kruemel's solution is algorithmically faster and is probably the desired solution. — dbc
– dbc, Commented Jul 21, 2014 at 22:43

thertweck · Accepted Answer · 2014-07-21 23:11:13Z

4

You solution is valid but as you said, I think there is a more efficient approach. I also think the given text implies this ("One approach is ...", "Another approach is...").

Comparing each element with the others is O(n^2).

If you sort the array first, you can remove duplicates with a single walk through the array.

Sorting is O(n log n), walking through is O(n).

The total complexity is then O( n log n) + O(n) = O(n log n).

This solution is valid, as the text clearly states that the order of the objects may be changed.

edited Jul 21, 2014 at 23:11

answered Jul 21, 2014 at 22:35

thertweck

1,1208 silver badges24 bronze badges

Sign up to request clarification or add additional context in comments.

4 Comments

takendarkk Over a year ago

I'm not good with time complexity so maybe you could clear this up for me. How can you take O(n log n) then add something to it, and still have the result be only O(n log n)?

thertweck Over a year ago

This is how addition in Landau notation works. It is obvious if you think about n being really large. In O(n log n) + O(n), the O(n log n) part will grow much faster, making O(n) irrelevant.

user1010101 Over a year ago

So, once i sort the array or have a sorted array, and remove the duplicated my complexity would be O(n) ? I can see that the solution i posted is n^2 since i have nested loops.

user1010101 Over a year ago

@kruemel hey i just posted another question is it possible for you to take a look?

Kick Buttowski · Accepted Answer · 2014-07-21 22:46:39Z

3

You can through this with less code by using Lambda Expression

Code:

public class LambdaTest     
{
    public static void main (String[] args)
    {

         List<Integer> objList = Arrays.asList(1,1,2,3,2,4,5,2,5);
         objList .forEach(i -> System.out.print(" " + i));
         System.out.println();
         List<Integer> noDub =objList.stream().distinct().collect(Collectors.toList());
         noDub.forEach(i -> System.out.print(" " + i));
    } 
}

output:

1 1 2 3 2 4 5 2 5 

1 2 3 4 5

answered Jul 21, 2014 at 22:46

Kick Buttowski

6,73713 gold badges39 silver badges61 bronze badges

4 Comments

user1010101 Over a year ago

are you using some kind of built in function. collectors is built in to java if i am not mistaken?

Kick Buttowski Over a year ago

all new functions from java 8. if u r interested in, I would be more than happy to share you what I have figured out so far :)

user1010101 Over a year ago

Yeah i would definitely be interested . In my solution i did not use any built in function as the text suggested otherwise. :) upvoted u already

Kick Buttowski Over a year ago

which is impressive to me. I'll study your solution for sure cuz I want to know what's going on in beneath of set function. my email is in my page. +1 u already too haha sounds kind of funny but seriously I did

turingcomplete · Accepted Answer · 2014-07-21 23:21:47Z

2

You can make use of better and faster data structures for this. Why not use a HashSet?

Example

import java.util.*;

public class Test
{
    public static void main(String[]args)
    {
        Integer [] arr = new Integer[]{4, 3, 1, 2, 4, 3, 2};
        Set<Integer> hs = new HashSet<Integer>(Arrays.asList(arr));
        System.out.println(hs);
    }
}

edited Jul 21, 2014 at 23:21

answered Jul 21, 2014 at 22:38

turingcomplete

2,2084 gold badges18 silver badges27 bronze badges

3 Comments

ChiefTwoPencils Over a year ago

Ugh, you'd think someone might add a couple lines of example code. This should be a comment in its present form.

user1010101 Over a year ago

Well i have not learned about HashSet yet...that's why i used this approach

Kick Buttowski Over a year ago

@user2733436 my sample is much easier than that. if you don't understand it lemme know so I can through more explanation for you

Leonard Brünings · Accepted Answer · 2014-07-21 22:50:01Z

1

The algorithm described in the book is akin to a bubble sort. The easiest way is to do this is to use two nested loops.


for (int i=0; i < a.length;i++) {
    long ref = a[i];
    for (int j=i+1; j < a.length; j++) {
        if(a[j] == ref) {
            a[j] = Long.MIN_VALUE;
        }
    }
}

I've left out the cleanup part.

answered Jul 21, 2014 at 22:50

Leonard Brünings

13.4k1 gold badge55 silver badges75 bronze badges

Comments

Caleb An · Accepted Answer · 2014-07-21 22:55:16Z

Your function performs the way the text implies, which unfortunately is a horrible way to do it.

First, you are assuming 666 won't be a possible value, which may be false, and in design even if it is temporarily true that may change with future changes to your program.

Second, the HighArray class should not be keeping an array of longs if you expect to be adding and deleting from the list. An ArrayList or your own implementation of a linked list would be more appropriate, since the order of your items matters.

If you must use the given HighArray class, then the simplest method would to be to convert the array a into a HashSet, which will keep track of all unique values. Then turn the HashSet back into an array.

Set<Long> uniqueNumbers = new HashSet<Long>(Arrays.asList(a));
a = uniqueNumbers.toArray(new long[uniqueNumbers.size()]);

Creating and managing the set is O(n lg(n)) overall complexity, and converting it back to an array is O(n) which is much better than the O(n^2) complexity of your original method.

Collectives™ on Stack Overflow

Removing duplicates from a Array [duplicate]

5 Answers 5

4 Comments

4 Comments

3 Comments

Comments

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

4 Comments

4 Comments

3 Comments

Comments

Comments

Linked

Related