12

To understand the code of pthread condition variables, I have written my own version. Does it look correct? I am using it in a program, its working, but working surprisingly much faster. Originally the program takes around 2.5 seconds and with my version of condition variables it takes only 0.8 seconds, and the output of the program is correct too. However, I'm not sure, if my implementation is correct.

struct cond_node_t
{
    sem_t s;
    cond_node_t * next;
};

struct cond_t
{
    cond_node_t * q;                // Linked List
    pthread_mutex_t qm;                 // Lock for the Linked List
};

int my_pthread_cond_init( cond_t * cond )
{
    cond->q = NULL;
    pthread_mutex_init( &(cond->qm), NULL );
}

int my_pthread_cond_wait( cond_t* cond, pthread_mutex_t* mutex )
{
    cond_node_t * self;

    pthread_mutex_lock(&(cond->qm));
    self = (cond_node_t*)calloc( 1, sizeof(cond_node_t) );
    self->next = cond->q;
    cond->q = self;
    sem_init( &self->s, 0, 0 );
    pthread_mutex_unlock(&(cond->qm));

    pthread_mutex_unlock(mutex);
    sem_wait( &self->s );
    free( self ); // Free the node
    pthread_mutex_lock(mutex);
}

int my_pthread_cond_signal( cond_t * cond )
{
    pthread_mutex_lock(&(cond->qm));
    if (cond->q != NULL) 
    {
        sem_post(&(cond->q->s));
        cond->q = cond->q->next;
    }
    pthread_mutex_unlock(&(cond->qm));
}

int my_pthread_cond_broadcast( cond_t * cond )
{
    pthread_mutex_lock(&(cond->qm));
    while ( cond->q != NULL) 
    {
        sem_post( &(cond->q->s) );
        cond->q = cond->q->next;
    }
    pthread_mutex_unlock(&(cond->qm));
}
4
  • You are freeing the self node without removing it from the list. Commented Jun 12, 2012 at 17:49
  • 1
    @n.m. The self node is taken off by signal and broadcast. Commented Jun 12, 2012 at 18:48
  • 1
    I realize this is purely educational, but I thought I should mention semaphores don't need a mutex to guarantee signals aren't lost. If you sem_post while no one's listening, sem_wait will still pick up on it the next time you call it. Semaphores are basically atomic counters that block to keep from going negative. Commented Jun 12, 2012 at 23:27
  • I think my_pthread_cond_signal and `my_pthread_cond_broadcast´ have the same implementation, is that correct? Commented Feb 17, 2021 at 14:54

3 Answers 3

6

Apart from the missing return value checks, there are some more issues that should be fixable:

  • sem_destroy is not called.
  • Signal/broadcast touch the cond_node_t after waking the target thread, potentially resulting in a use-after-free.

Further comments:

  • The omitted destroy operation may require changes to the other operations so it is safe to destroy the condition variable when POSIX says it shall be safe. Not supporting destroy or imposing stronger restrictions on when it may be called will simplify things.
  • A production implementation would handle thread cancellation.
  • Backing out of a wait (such as required for thread cancellation and pthread_cond_timedwait timeouts) may lead to complications.
  • Your implementation queues threads in userland, which is done in some production implementations for performance reasons; I do not understand exactly why.
  • Your implementation always queues threads in LIFO order. This is often faster (such as because of cache effects) but may lead to starvation. Production implementation may use FIFO order sometimes to avoid starvation.
Sign up to request clarification or add additional context in comments.

Comments

5

You don't seem to respect this requirement:

These functions atomically release mutex and cause the calling thread to block on the condition variable cond; atomically here means "atomically with respect to access by another thread to the mutex and then the condition variable". That is, if another thread is able to acquire the mutex after the about-to-block thread has released it, then a subsequent call to pthread_cond_broadcast() or pthread_cond_signal() in that thread shall behave as if it were issued after the about-to-block thread has blocked.

You unlock and then wait. Another thread can do lots of things between these operations.

P.S. I'm not sure myself if I'm interpreting this paragraph correctly, feel free to point out my error.

1 Comment

I don't see that this is a problem here. The semaphore is correctly initialized. So even if another thread kicks in, the semaphore will store any token that will be posted by a signal or broadcast. Such a post operation can happen before the thread actually calls sem_wait or while it is already in. In both cases the thread will then continue execution.
4

Basically your strategy looks ok, but you have one major danger, some undefined behavior, and a nit pick:

  • your are not inspecting the return values of your POSIX functions. In particular sem_wait is interruptible so under heavy load or bad luck your thread will be woken up spuriously. You'd have to carefully catch all that
  • none of your functions returns a value. If some user of the functions will decide to use the return values some day, this is undefined behavior. Carefully analyse the error codes that the condition functions are allowed to return and do just that.
  • don't cast the return of malloc or calloc

Edit: Actually, you don't need malloc/free at all. A local variable would do as well.

2 Comments

Why not cast the return of malloc/calloc?
First, most important, it is useless. It is valid to assign void* to any pointer, this is what it is made for in C. Second, only use casts if they are absolutely necessary. They are hard to find textually and switch off all warnings and diagnostics. Third, this can hide a subtle bug, when you forget the #include and the compiler takes this to return int.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.