I ran into a very weird problem today. Long story short, my function returns one value, but the caller gets a different one. Somewhere in my code I have this call:

Message* m = NULL;
m = connection_receive(c);

Where connection_receive is defined as follows:

Message* connection_receive(Connection* c)
{
    Message* k;

    if (c->state == CON_STATE_AUTHENTICATED)
    {
        pthread_mutex_lock(&c->mutex_in);

        if (g_queue_is_empty(c->in))
            k = NULL;
        else
            k = (Message*)g_queue_pop_head(c->in);

        pthread_mutex_unlock(&c->mutex_in);
        /* Until here, k is reachable and contains the correct data. */
        return k;
    }
    else
        return NULL; 
}

Here's a gdb run, I stopped right before the return and right after the assignment:

222         return k;
(gdb) p k
$1 = (Message *) 0x7ffff0000950
(gdb) n
226 }
(gdb) n
main () at src/main.c:57
57              if (m)
(gdb) p m
$2 = (Message *) 0xfffffffff0000950

Of course, if we try to access 0xfffffffff0000950 we'll get a segmentation fault.

If I change the function so that, instead of returning the value, it passes it back through a second parameter, it works. But I would like to know what went wrong with this version.

Thank you very much.

EDIT: This works, but it's not convenient, and I would still like to know why such a strange error is happening.

void connection_receive2(Connection* c, Message** m)
{
    if (c->state == CON_STATE_AUTHENTICATED)
    {
        pthread_mutex_lock(&c->mutex_in);

        if (g_queue_is_empty(c->in))
            *m = NULL;
        else
            *m = (Message*)g_queue_pop_head(c->in);

        pthread_mutex_unlock(&c->mutex_in);
    }
    else
        *m = NULL;
}

EDIT2: Solved. Thanks, all. The problem was a typo in the header file. I can't use -Werror because I need to do things that raise some warnings, and with a large make output and a large header I missed it.

  • I had that happen once. fin showed one return value and the variable it was stored into showed another. A recompile fixed it for me (didn't even change the source at all); best I can guess is a dependent object file didn't get recompiled, otherwise possibly a (rarely encountered) bug in gcc. Commented Nov 6, 2011 at 18:48
  • You need to boil this down to a SSCCE that demonstrates the problem and post it - there's nothing wrong with the code you've posted; the problem lies elsewhere. Commented Nov 6, 2011 at 18:52
  • @BrianRoach, I'll see what I can do. It's kind of a big project. Commented Nov 6, 2011 at 22:41
  • You might see this behavior if there is a missing prototype for connection_receive() where you call connection_receive() (the return value will be converted to a signed int and back to a pointer again, since the compiler will assume the function returns an int). Commented Nov 6, 2011 at 22:52
  • @Victor so the prototype didn't match the definition? Yeah, that's a nasty one. Commented Nov 6, 2011 at 23:40

3 Answers

  1. How is your m defined?
  2. Does your caller have access to the right prototype?
  3. What architecture are you on?

I suspect that there is a type mismatch and that my question 2 is the crux of it.

You are returning a pointer, which (I suppose) is 48 or 64 bits wide. The caller, however, expects an int, which is probably 32 bits and signed. When that value is converted back to a pointer, it gets sign-extended. That matches your gdb output: the low 32 bits (0xf0000950) survive, and because bit 31 is set, the upper 32 bits are filled with ones.


5 Comments

+1, I'm pretty sure (2) is correct. This is the reason you should always develop with the maximum compile warnings set (-Wall -Werror on gcc), since it'll catch this sort of thing.
yes, the lower half of the return value being exactly the same is very suspicious.
@therefromhere I'm sorry, I forgot to say. My "m" is also a Message*. Yes, the caller has access to the same prototype; I'm in fact testing the module. Neither -Wall nor -Wextra gives me anything; it compiles like a charm. I'm building and running it on Fedora 15, 2.6.40.6-0, 64-bit.
Bull's eye. Thanks. After reading your comment I decided to recheck the header file, there was a typo on the definition.
So module testing again proved to be very helpful :-)

Did you push a malloc'ed object onto the queue? If you pushed a stack object instead, you may end up with weird behavior when you pop items.

1 Comment

It's a malloc'ed object. I edited the original post and posted an equivalent definition that works fine. It's really strange!

We faced the same problem, and the root cause was an implicit declaration of the function connection_receive(). Its return type defaulted to int, which is signed, so the truncated value was sign-extended when it was stored in m.
