0

So I have a quick sort algorithm that I want to run in two different threads, the idea is to have 2 independent halves of the array being sorted at once (this should not be a problem by quick sort nature of partition).

My code is as follows:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void troca (int *v, int i, int j);
int partition (int *v, int ini, int fim);
void quicksort (int *v, int ini, int fim, int size);

typedef struct {
    int ini, mid, fim;
} thread_arg;

int size;
int *v;

void *thread_func(void* arg){
    thread_arg *a = arg;
    quicksort(v, a->ini, a->mid, a->fim);
    printf("done in\n");
    return NULL;
}

int main()
{
    // initializations
    scanf("%d", &size);
    v = malloc(size * sizeof(int));

    // read array
    for (int i = 0; i < size; ++i)
        scanf("%d", &v[i]);

    // arguments
    thread_arg argument1, argument2;
    int mid = partition(v, 0, size-1);

    argument1.ini = 0;
    argument1.mid = mid-1;
    argument1.fim = size;

    argument2.ini = mid;
    argument2.mid = size-1;
    argument2.fim = size;

    pthread_t thread1, thread2;    

    // thread and execution
    pthread_create(&thread1, NULL, thread_func, &argument1);
    printf("done out 1\n");
    pthread_create(&thread2, NULL, thread_func, &argument2);
    printf("done out 2\n");

    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    free(v);

    return 0;
}

void quicksort (int *v, int ini, int fim, int size){
    if (ini >= fim)
        return;
    int meio = partition(v, ini, fim);
    quicksort(v, ini, meio-1, size);
    quicksort(v, meio+1, fim, size);
}

int partition (int *v, int ini, int fim){
    int i = ini-1, j = ini;
    troca(v, (ini+fim)/2, fim);
    int pivo = v[fim];

    for (; j < fim; ++j)
    {
        if (v[j] <= pivo)
        {
            i++;
            troca(v, i, j);
        }
    }
    troca(v, i+1, fim);
    return i+1; //indice do pivo;
}


void troca (int *v, int i, int j){
    int aux = v[i];
    v[i] = v[j];
    v[j] = aux;
    return;
}

The execution works and sorts perfectly, it does generates 2 new independent threads that sorts halves of the array each. The problem is, it does not do it at the same time. Running the program with a input of 100m random numbers:

done out 1
done out 2
done in
done in

real    0m47,464s
user    0m50,686s
sys     0m0,452s

But it takes about 25 seconds for the first 3 lines to appear, and ~25 for the last one, which indicates that the second thread is waiting for the first one to run.

In the htop console, it appears that in some point both run at the same time (this is backed at the fact that this program runs a bit faster than my normal one)

Finally, I understand that it is not safe to work simultaneously with this sort of data, but on this sorting example it should be fine.

4
  • argument1.fim = size; Should that be size / 2? Commented Jan 14, 2021 at 20:33
  • @JohnnyMopp, this size is not included in the sorting and it serves the purpose of not exceeding the array boundaries only Commented Jan 14, 2021 at 20:36
  • Are you sure you run it a machine/setup that allows two threads to run at the same time ? Commented Jan 14, 2021 at 20:37
  • Pretty sure, I'm running on Debian 10 buster on a intel i5-3337U quad core. And it shows for a brief period all 3 processes running on htop Commented Jan 14, 2021 at 20:42

3 Answers 3

1

You don't divide the work fairly among the threads. To see this, modify your code as follows:

    // thread and execution
    pthread_create(&thread1, NULL, thread_func, &argument1);
    printf("done out 1 (%d)\n", argument1.mid - argument1.ini + 1);
    pthread_create(&thread2, NULL, thread_func, &argument2);
    printf("done out 2 (%d)\n", argument2.mid - argument2.ini + 1);

You will see that one thread tends to have about twice as much work as the other.

For example, here are a few runs using random data:

done out 1 (66474145)
done out 2 (33525855)

done out 1 (21794872)
done out 2 (78205128)

done out 1 (67867800)
done out 2 (32132200)

You should never divide your work into a small number of very big chunks and assign each chunk to a thread if you care about concurrency. Instead, create a queue of small tasks and let threads pull tasks from the queue as they finish.

Sign up to request clarification or add additional context in comments.

4 Comments

Thanks, maybe quick sort is not a good example of a task to be multi threaded
@ACoelho It's fine, you just don't want to divide the work to the threads at the beginning but as you go and you don't want to force which thread does what work but allow the implementation to find the optimal assignment.
I'm learning about multi threading now, so thanks for the input
Here's a quick and dirty fixed version: onlinegdb.com/By1GarCCD
1

The threads are running concurrently (well, not necessarily concurrently, but perceivably concurrently). The 25 second delay you are seeing is due to a bug in your quick sort (or perhaps the way you're sharing the list between your two threads). Essentially, thread 2 is assigned much more work than thread 1, and so it takes much longer to complete. Thread 2 is not simply executing "after" thread 1, or "waiting" for thread 1.

To prove this, I added an unsigned long* argument to quicksort and had it increment the value referenced by said pointer at each call (essentially counting the number of times each thread calls quicksort), and I ran it with 10M (not 100M) random values. Thread 1's count ended up at 3,851,991, and thread 2's count ended up at 9,693,697. Sure, there can be some small variation between the two counts due to the randomness in the generation of the list. But the difference is nearly a factor of 3, which is far more significant than you could possibly expect from slight random variations.

I suggest you try a different implementation of quicksort (one which is known to work). I also suggest being more careful with data sharing (make sure the two threads never access each others' data, especially without synchronization) to get a more accurate measure of timing; the last thing you want is for one thread to be sorting or unsorting the other thread's data.

1 Comment

Yes, quick sort is not easily dividable in the first partition and is rarely divided in 2 equal parts, i'll try some more threading with more homogeneous code
0

The thread creation is not fair: thread#1 gets created before thread#2. Moreover, when thread#1 is created, it may run and preempt the main thread which may wait for it to give back the CPU to create and start thread#2. However, the thread running under the default SCHED_OTHER policy have an unpredictable behavior.

To add some predictability:

  • Make the threads start at the same time once created. Use a barrier to trigger a "go" for all the threads at the same time. Cf. pthread_barrier_init()
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void troca (int *v, int i, int j);
int partition (int *v, int ini, int fim);
void quicksort (int *v, int ini, int fim, int size);



pthread_barrier_t barrier;

typedef struct {
    int id;
    int ini, mid, fim;
} thread_arg;

int size;
int *v;

void *thread_func(void* arg){
    thread_arg *a = arg;
    pthread_barrier_wait(&barrier);
    quicksort(v, a->ini, a->mid, a->fim);
    printf("done in %d\n", a->id);
    return NULL;
}

int main()
{
    // initializations
    scanf("%d", &size);
    v = malloc(size * sizeof(int));

    // read array
    for (int i = 0; i < size; ++i)
        scanf("%d", &v[i]);

    // arguments
    thread_arg argument1, argument2;
    int mid = partition(v, 0, size-1);

    argument1.id = 1;
    argument1.ini = 0;
    argument1.mid = mid-1;
    argument1.fim = size;

    argument2.id = 2;
    argument2.ini = mid;
    argument2.mid = size-1;
    argument2.fim = size;

    pthread_t thread1, thread2;    

    pthread_barrier_init(&barrier, NULL, 3);

    // thread and execution
    pthread_create(&thread1, NULL, thread_func, &argument1);
    printf("done out 1\n");
    pthread_create(&thread2, NULL, thread_func, &argument2);
    printf("done out 2\n");

    // Start the threads
    pthread_barrier_wait(&barrier);

    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    free(v);
    pthread_barrier_destroy(&barrier);

    return 0;
}

void quicksort (int *v, int ini, int fim, int size){
    if (ini >= fim)
        return;
    int meio = partition(v, ini, fim);
    quicksort(v, ini, meio-1, size);
    quicksort(v, meio+1, fim, size);
}

int partition (int *v, int ini, int fim){
    int i = ini-1, j = ini;
    troca(v, (ini+fim)/2, fim);
    int pivo = v[fim];

    for (; j < fim; ++j)
    {
        if (v[j] <= pivo)
        {
            i++;
            troca(v, i, j);
        }
    }
    troca(v, i+1, fim);
    return i+1; //indice do pivo;
}


void troca (int *v, int i, int j){
    int aux = v[i];
    v[i] = v[j];
    v[j] = aux;
    return;
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.