3

How can I assemble a set of the lowest or greatest numbers in an array? For instance, if I wanted to find the lowest 10 numbers in an array of size 1000.

I'm working in C but I don't need a language specific answer. I'm just trying to figure out a way to deal with this sort of task because it's been coming up a lot lately.

4 Answers

5

The QuickSelect algorithm separates a predefined number of the lowest or greatest elements from the rest without fully sorting the array. It uses the same partition procedure as Quicksort, but stops as soon as the pivot lands at the required position.
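
The partition-and-stop idea can be sketched in C roughly as follows. This is a minimal illustration, not a tuned implementation: the names `swap_ints`, `partition`, and `select_lowest_k` are made up for this example, the pivot is simply the last element of the current range (so already-sorted input hits the O(n^2) worst case), and it assumes 0 < k <= n.

```c
#include <stddef.h>

/* Swap two ints in place. */
static void swap_ints(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Lomuto partition of arr[lo..hi] around arr[hi] (hypothetical helper).
   Returns the pivot's final index: everything before it is smaller,
   everything after it is greater or equal. */
static size_t partition(int *arr, size_t lo, size_t hi)
{
    int pivot = arr[hi];
    size_t store = lo;
    for (size_t i = lo; i < hi; i++) {
        if (arr[i] < pivot) {
            swap_ints(&arr[i], &arr[store]);
            store++;
        }
    }
    swap_ints(&arr[store], &arr[hi]);
    return store;
}

/* Rearranges arr so that arr[0..k-1] hold the k lowest values,
   in no particular order. Assumes 0 < k <= n. */
void select_lowest_k(int *arr, size_t n, size_t k)
{
    size_t lo = 0, hi = n - 1;
    size_t target = k - 1;          /* index the pivot must land on */

    while (lo < hi) {
        size_t p = partition(arr, lo, hi);
        if (p == target)
            return;                 /* pivot found its position: stop */
        if (p < target)
            lo = p + 1;             /* boundary lies to the right */
        else
            hi = p - 1;             /* boundary lies to the left (p > target >= 0) */
    }
}
```

After `select_lowest_k(array, 1000, 10)` the first 10 slots of `array` contain the 10 lowest values, unsorted. Note that it reorders the array in place, so copy it first if the original order matters.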


3 Comments

Given that sorting an int array is O(n) (via radix-sort), and QuickSelect is O(n^2) worst case, I'd recommend sorting.
@EOF The approach may depend on the conditions: if we cannot tolerate even a small probability of the worst case, we should choose another method (for example, a binary heap with O(n log k) complexity). Note that Q/S is usually the preferred sort in the general case.
Another option is Introselect, which is designed explicitly to avoid QuickSelect's worst-case (but will default to QuickSelect if it can afford to).
1

Method 1: Sort the array

You can sort the array, for example with a quick sort, and take the first 10 elements. But this is rather inefficient: you are only interested in the first 10 elements, and sorting the entire array for that is overkill.
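
For completeness, a sort-then-take version might look like this, using the standard library's `qsort`. The function names `compare_ints` and `lowest_k_by_sorting` are just placeholders for this sketch, and it assumes k <= n and that reordering the input array is acceptable.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: ascending order of ints. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow that x - y could cause */
}

/* Sorts arr in place, then copies its k lowest values into out.
   Assumes k <= n and that out has room for k ints. */
void lowest_k_by_sorting(int *arr, size_t n, int *out, size_t k)
{
    qsort(arr, n, sizeof arr[0], compare_ints);
    memcpy(out, arr, k * sizeof out[0]);
}
```

Unlike the other methods, this also leaves the result sorted, which may be worth the extra cost if you need that anyway.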

Method 2: Do a linear traversal and keep track of 10 elements.

int *lowerTen = malloc(10 * sizeof *lowerTen);

//'array' is your array with 1000 elements
for(int i=0; i<size_of_array; i++){
    if(comesUnderLowerTen(array[i], lowerTen)){
        addToLowerTen(array[i], lowerTen);
    }
}

int comesUnderLowerTen(int num, int *lowerTen){
    //if there are not yet 10 elements in lowerTen, insert.

    //else if 'num' is less than the largest element in lowerTen, insert.
}

void addToLowerTen(int num, int *lowerTen){
    //should make sure that num is inserted at the right place in the array
    //i.e, after inserting 'num' *lowerTen should remain sorted
}

Needless to say, this is not a working example. Also, do this only if the 'lowerTen' array needs to maintain a sorted list of a small number of elements. If you need the first 500 elements in a 1000 element array, this would not be the preferred method, because each insertion may have to shift up to 500 elements.
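
One way the outline above could be fleshed out is sketched below. It folds the "does it belong?" check and the sorted insert into a single function; the names `add_to_lowest` and `find_lowest`, and the explicit `count`/`cap` parameters, are assumptions of this sketch rather than part of the pseudocode.

```c
#include <stddef.h>

/* Inserts num into the ascending-sorted buffer lowest[0..*count-1], which can
   hold at most cap values. While there is room, num is always inserted; once
   the buffer is full, num replaces the current largest value only if it is
   smaller. */
void add_to_lowest(int num, int *lowest, size_t *count, size_t cap)
{
    size_t pos;

    if (*count < cap) {
        pos = (*count)++;               /* room left: open a slot at the end */
    } else if (cap > 0 && num < lowest[cap - 1]) {
        pos = cap - 1;                  /* full: overwrite the largest value */
    } else {
        return;                         /* num is not among the lowest */
    }

    /* Shift larger values one slot right so the buffer stays sorted. */
    while (pos > 0 && lowest[pos - 1] > num) {
        lowest[pos] = lowest[pos - 1];
        pos--;
    }
    lowest[pos] = num;
}

/* Fills lowest with up to cap of the lowest values in array;
   returns how many values were stored. */
size_t find_lowest(const int *array, size_t n, int *lowest, size_t cap)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        add_to_lowest(array[i], lowest, &count, cap);
    return count;
}
```

For the case in the question you would call `find_lowest(array, 1000, lowerTen, 10)` with `lowerTen` pointing at room for 10 ints. Each insert costs at most `cap` shifts, which is why this only pays off when `cap` is small.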

Method 3: Do method 2 when you populate the original array

This works only if your original 1000 element array is populated one by one - in that case instead of doing a linear traversal on the 1000 element array you can maintain the 'lowerTen' array as the original array is being populated.

Method 4: Do not use an array

Tasks like these are easier if you can maintain a data structure such as a binary search tree built from your original array. But constructing a BST from your array and then reading off its first 10 elements costs about as much as sorting the array and then doing the same. Only do this if your use case demands searching a really large array and the data needs to be in memory.

4 Comments

A variant of method 2 is to use a binary max-heap stored in an array to hold the smallest values (or a min-heap to store the largest values). Since the operations on the binary min-heap are O(1) or O(log n), with n being the number of values looked for (as opposed to N being the data set size), this is especially good for cases where n is much smaller than N.
a couple of problems with this line: int lowerTen = (int*)malloc(size_of_array);. 1) in C, casting the returned value is just cluttering the code. the returned type is void* which can be assigned to any other pointer, 2) malloc() returns a pointer, but int lowerTen is an integer, not a pointer.
regarding this line: for(int i=0; i<array.length; i++){. in C, arrays do not have a .length attribute.
Thanks. My C is very rusty. I should have mentioned that this is pseudocode. I've made the corrections.
0

Implement a priority queue. Loop through all the numbers and add them to the queue. Once the queue holds 10 elements, check whether each new number is lower than the highest one in the queue. If it is, delete that highest number and add the current one.

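At the end you will have a priority queue holding the 10 lowest numbers from your array. (With a heap-based queue the time needed is O(n log k) for k kept numbers, which is effectively O(n) for a fixed k = 10, where n is the length of your array.)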
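
A rough sketch of this in C, with the priority queue implemented as a binary max-heap stored in a plain array, could look like the following. The names `sift_down` and `lowest_k_heap` are invented for this example, and it assumes 0 < k <= n.

```c
#include <stddef.h>

/* Restores the max-heap property by sinking heap[i] within heap[0..size-1]. */
static void sift_down(int *heap, size_t size, size_t i)
{
    for (;;) {
        size_t largest = i;
        size_t left = 2 * i + 1;
        size_t right = 2 * i + 2;

        if (left < size && heap[left] > heap[largest])
            largest = left;
        if (right < size && heap[right] > heap[largest])
            largest = right;
        if (largest == i)
            return;                     /* both children are smaller: done */

        int tmp = heap[i];
        heap[i] = heap[largest];
        heap[largest] = tmp;
        i = largest;
    }
}

/* Leaves the k lowest values of src in heap[0..k-1], arranged as a max-heap,
   so heap[0] is the largest of the kept values. Assumes 0 < k <= n. */
void lowest_k_heap(const int *src, size_t n, int *heap, size_t k)
{
    /* Start with the first k values and turn them into a max-heap. */
    for (size_t i = 0; i < k; i++)
        heap[i] = src[i];
    for (size_t i = k / 2; i-- > 0; )
        sift_down(heap, k, i);

    /* A later value belongs only if it beats the current highest one. */
    for (size_t i = k; i < n; i++) {
        if (src[i] < heap[0]) {
            heap[0] = src[i];           /* replace the highest number */
            sift_down(heap, k, 0);
        }
    }
}
```

The result is in heap order rather than sorted order. Replacing the root and sifting down does the "delete the highest, add the current one" step in a single O(log k) pass.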

If you need any more tips, add a comment :)

0

the following code

  1. cleanly compiles
  2. performs the desired functionality
  3. might not be the most efficient
  4. handles duplicates
  5. handles numbers less than 0 (the running maximum is read from lowestDest instead of starting at 0)

and now the code

#include <stdlib.h>  // size_t

// copies the numItemsInDest lowest values of sourceArray into lowestDest
// assumes 0 < numItemsInDest <= numItemsInSource
void selectLowest( int *sourceArray, size_t numItemsInSource, int *lowestDest, size_t numItemsInDest )
{
    size_t maxIndex = 0;

    // initially populate lowestDest array
    // and remember where its largest value is
    for( size_t i=0; i<numItemsInDest; i++ )
    {
        lowestDest[i] = sourceArray[i];
        if( lowestDest[maxIndex] < lowestDest[i] )
        {
            maxIndex = i;
        }
    }

    // search rest of sourceArray and 
    // if lower than max in lowestDest, 
    // then 
    //    replace
    //    find new max value 
    for( size_t i=numItemsInDest; i<numItemsInSource; i++ )
    {
        if( lowestDest[maxIndex] > sourceArray[i] )
        {
            lowestDest[maxIndex] = sourceArray[i];

            maxIndex = 0;
            for( size_t j=1; j<numItemsInDest; j++ )
            {
                if( lowestDest[maxIndex] < lowestDest[j] )
                {
                    maxIndex = j;
                }
            }
        }
    }
} // end function: selectLowest
