algorithm for finding longest sequence of the same element in 1D array-looking for better solution

Question

I need to find algorithm which will find the longest seqeunce of element in one dimension array.

For example:

int[] myArr={1,1,1,3,4,5,5,5,5,5,3,3,4,3,3}

solution will be 5 because sequnece of 5 is the longest. This is my solution of the problem:

static int findSequence(int [] arr, int arrLength){

    int frequency=1;
    int bestNumber=arr[0];
    int bestFrequency=0;

    for(int n=1;n<arrLength;n++){
        if(arr[n]!=arr[n-1]){ 
            if(frequency>bestFrequency){
                bestNumber=arr[n-1];
                bestFrequency=frequency;
            }
            frequency=1;
        }else { 
            frequency++;
        }
    }

    if( frequency>bestFrequency){
        bestNumber=arr[arrLength-1];
        bestFrequency=frequency;
    }

    return bestNumber;
}

but I'm not satisfied.May be some one know more effective solution?

Wait, so you want a more efficient solution than checking every element once (which you obviously need to do)? — Bernhard Barker
– Bernhard Barker, Commented Nov 20, 2013 at 20:43
There are some optimizations if the data has some patterns, but generally linear time is the best worst case speed you can do. — zw324
– zw324, Commented Nov 20, 2013 at 20:48
I looking for other solutions which will be more effective or less complex than my. I hope that some one better then I show me other way to solve this problem. — Krzysztof
– Krzysztof, Commented Nov 20, 2013 at 20:49
Maybe someone can work this into a more complex solution. Suppose I just determined that the sequence starts with 1 million "0"'s followed by a "1". If I look at the 2 millionth element, and it is not a "1", then I know I can safely ignore all of the elements between 1,000,001 and 2,000,000. On the other hand, if it was a "1", I would investigate that region some more possibly starting at its middle, just in case it contained only 1's. — Xantix
– Xantix, Commented Nov 20, 2013 at 21:51

BlackJack · Accepted Answer · 2013-11-20 22:18:27Z

2

You can skip the some number in the array in the following pattern:

Maintain a integer jump_count to maintain the number of elements to skip (which will be bestFrequency/2). The divisor 2 can be changed according to the data set. Update the jump_count every time you update the bestFrequency.

Now, after every jump

If previous element is not equal to current element and frequency <= jump_count, then scan backwards from current element to find number of duplicates and update the frequency.

e.g. 2 2 2 2 3 3 and frequency = 0 (bold are previous and current elements), then scan backwards to find number of 3's and update the frequency = 2
If previous element is not equal to current element and frequency > jump_count, scan for scan for every element to update the frequency and update the bestFrequency if needed.

e.g. 2 2 2 2 2 3 3 and frequency = 1 (bold are previous and current elements), scan for number of 2's in this jump and update the frequency = 1 + 4. Now, frequency < bestFrequency, scan backwards to find number of 3's and update the frequency = 2.
If previous element = current element, scan the jump to make sure it is continuous sequence. If yes, update the frequency to frequency + jump_count, else consider this as the same case as step 2.

Here, we will consider two examples:

a) 2 2 2 2 2 2 (bold are previous and current elements), check if the jump contains all 2's. Yes in this case, so add the jump_count to frequency.

b) 2 2 2 2 3 2 (bold are previous and current elements), check if the jump contains all the 2's. No in this case, so considering this as in step 2. So, scan for number of 2's in this jump and update the frequency = 1 + 4. Now, frequency < bestFrequency, scan backwards to find number of 2's(from the current element) and update the frequency = 1.

Optimization: You can save some loops in many cases.

P.S. Since this is my first answer, I hope I am able to convey myself.

edited Nov 20, 2013 at 22:18

answered Nov 20, 2013 at 21:37

BlackJack

931 silver badge7 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Jim Mischel Over a year ago

In your case #1, you still have to back up and count the 3's. Steps 2 and 3 require that you scan the entire jump. So case 1 is the only potential savings. It could be significant, but I don't know if it's enough in the general case to compensate for the increased implementation complexity.

BlackJack Over a year ago

Ya, agreed that it will run in O(n) time in cases, but then you can't optimize it only if you have some special test cases to run on. Else it will take O(n) time in worst case anyways. Also, corrected the error of case 1, thanks for pointing out :)

Dan Lowe · Accepted Answer · 2015-12-13 06:07:04Z

2

Try this:

public static void longestSequence(int[] a) {
    int count = 1, max = 1;

    for (int i = 1; i < a.length; i++) {
        if (a[i] == a[i - 1]) {
            count++;
        } else {
            if (count > max) {
                max = count;
            }
            count = 1;
        }
    }

    if (count> max)
        System.out.println(count);
            else
        System.out.println(max);    
}

edited Dec 13, 2015 at 6:07

Dan Lowe

57.6k20 gold badges133 silver badges115 bronze badges

answered Nov 20, 2013 at 21:02

kedark

2461 silver badge8 bronze badges

3 Comments

Krzysztof Over a year ago

I'm afraid that it's not working :( , for int[] arr={1,1,4,4,4,4,4,6,6,7,7,7,7,7,7,7,7}; it gives me 5 as a result

templatetypedef Over a year ago

You need to update the max after the for loop finishes in case the longest stretch ends at the end of the array.

kedark Over a year ago

you are right i missed one condition. i corrected with last statement.

Colin D · Accepted Answer · 2013-11-20 21:19:59Z

1

Your algorithm is pretty good.

It touches each array element (except the last) only once. This puts it at O(n) runtime which for this problem seems like the best worst case runtime you can get and is a pretty good worst case runtime as far as algorithms go.

One possible suggestion is when you find a new bestFrequency and n+bestFrequency > arrayLength you can break out of the loops. This is because you know a longer sequence cannot be found.

answered Nov 20, 2013 at 21:19

Colin D

5,6611 gold badge27 silver badges38 bronze badges

Comments

user1676075 · Accepted Answer · 2013-11-20 21:05:02Z

0

The only optimization that seems possible is:

for(int n=1;n<arrLength && frequency + (arrLength - n) >= bestFrequency;n++){

because you don't need to search any further one you can't possible exceed the best frequency with the number of elements remaining (probably possible to simplify that even further given a little more thought).

But as others point out, you're doing a O(n) search on n elements for a sequence - there's really no more efficient algorithm available.

answered Nov 20, 2013 at 21:05

user1676075

3,0961 gold badge22 silver badges26 bronze badges

3 Comments

Marko Topolnik Over a year ago

but now you have added a quite heavy calculation to each step through the array. This improves some cases and hurts others, so isn't really an optimization, just another tradeoff.

Marko Topolnik Over a year ago

what you should have done instead is update the arrLength variable each time you find a longer sequence.

user1676075 Over a year ago

Agreed, that would be better (store a temp "end length" that is updated to be arrLength-bestFrequency every time bestFrequency changes).

user1676075 · Accepted Answer · 2013-11-21 13:16:06Z

I was thinking this must be an O(n) problem, but now I'm wondering if it doesn't have to be, that you could potentially make it O(log n) using a binary search (I don't think what @BlackJack posted actually works quite right, but it was inspiring):

Was thinking something like keep track of first, last element (in a block, probably a recursive algorithm). Do a binary split (so middle element to start). If it matches either first or last, you possibly have a run of at least that length. Check if the total length could exceed the current known max run. If so, continue, if not break.

Then repeat the process - do a binary split of one of those halves to see if the middle item matches. Repeat this process, recursing up and down to get the maximum length of a single run within a branch. Stop searching a branch when it can't possibly exceed the maximum run length.

I think this still comes out to be an O(n) algorithm because the worth-case is still searching every single element (consider a max length of 1 or 2). But you limit to checking each item once, and you search into the most-likely longest branches first (based on start/middle/end matches), it could potentially skip some fairly long runs. A breadth-first rather than depth-first search would also help.

Collectives™ on Stack Overflow

algorithm for finding longest sequence of the same element in 1D array-looking for better solution

5 Answers 5

2 Comments

3 Comments

Comments

3 Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

2 Comments

3 Comments

Comments

3 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related