2

I have the following C code:

#include <stdlib.h>
#include <stdio.h>

char* foo() {
    char abc[4] = "abc";
    return abc;
}

int main() {
    printf("%s", foo());
    return 0;
}

When I compile it with gcc and run the executable, I get (null)% as output.

If I run the slightly modified code:

#include <stdlib.h>
#include <stdio.h>

char* foo() {
    char abc[4] = "abc";
    return abc;
}

int main() {
    printf("%c", *(foo()));
    return 0;
}

I get a segmentation fault.

My question is: why wouldn't my first code get a segmentation fault? I'm running Linux and gcc version: gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0

Both programs, when compiled, generate the warning: function returns address of local variable [-Wreturn-local-addr]

13
  • 1
    Re “why wouldn't my first code get a segmentation fault?”: In both cases, it appears the program acted as if a null pointer were returned by foo. (This behavior is not defined by the C standard, may be the result of optimizer behavior, and is not something you should rely on.) When you passed the pointer to printf for %s, the printf implementation checked whether it was a null pointer and printed “(null)” instead of attempting to dereference it. When you attempted to dereference the pointer yourself to pass to printf for %c, there was no preliminary check, so the program crashed. Commented Sep 12, 2020 at 17:15
  • The question "why doesn't this obviously incorrect program crash" never has an interesting answer. It just got lucky this time. Move along, citizen, nothing to see here. Don't write incorrect programs. Obviously this is easier said than done, but the very least you can do is pay attention to compiler warnings. Commented Sep 12, 2020 at 17:58
  • @n.'pronouns'm.: “The question "why doesn't this obviously incorrect program crash" never has an interesting answer” is false. There are things to be learned, including things about how compilers work, how linkers work, how operating systems work, and more. In this case, at least three of the answers were wrong, so clearly there was information to be learned. In general, learning about the causes of crashes helps diagnose future bugs, thus speeding, improving, and reducing the cost of software development. Commented Sep 12, 2020 at 18:47
  • @n.'pronouns'm.: No, it is not obvious. There are multiple things to learn there. One is the rule that was violated. It was not that dereferencing a pointer to an object whose lifetime has ended is undefined. The rule that was violated is that using such a pointer has undefined behavior. Three answers got that wrong, which means three people, and possibly more readers, did not know or notice a rule that could have caused other programs to misbehave. Learning the correct rule, and learning to recognize it, is useful for avoiding bugs. Commented Sep 12, 2020 at 23:57
  • @n.'pronouns'm.: Another thing to learn is how the behavior manifested. The compiler did not do a simple thing here. A straightforward implementation of the code would have left an address behind. It did not. Many people are not aware the compiler makes such overarching transformations of a program. They learn a simple model of C computing taught in classes and see optimization as things like consolidation of common subexpressions, or maybe rewriting some arithmetic. Learning that today’s compilers make large abstract transformations is new. Commented Sep 12, 2020 at 23:59

5 Answers

4

At the moment return abc; starts to execute, abc is a pointer to an array defined inside foo. (Formally, it designates the array itself, but it is automatically converted to the address of the first element.) The function would be returning this pointer value. However, when execution of the function ends, the lifetime of the array ends.

Per C 2018 6.2.4 2:

The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

When a value is indeterminate in C, it may behave as if it has any value, including having a different value each time you attempt to use it or being a trap representation (C 2018 3.19.2 and 3.19.3). Note that this does not just mean what the pointer value points to is indeterminate; the value of the pointer itself is indeterminate.

So, even if abc had some address in memory, say 100400, that does not mean 100400 is returned to the caller. The value returned to the caller is indeterminate: It can be anything, including a null pointer value.

It appears your compiler’s optimizer has responded to the undefined behavior in your code by providing or allowing a null pointer value as the return value of the function foo. This is allowed by the C standard.

When you passed this null pointer to printf for use with %s, your printf implementation checked the pointer, saw it was a null pointer, and printed “(null)” instead of attempting to use it to access a string in memory.

When you tried to dereference the pointer, using *(foo()), there was no preliminary check of the pointer value. The machine code of the program attempted to use the null pointer to access memory, and this resulted in a segmentation fault.


4 Comments

I have a follow-up question: on another machine where I repeatedly run these two programs, the first one generates a different nonsense string at each run (this is expected). The second one, however, always successfully prints a%. What would be a good explanation of this? Would that compiler choose to somehow retain the actual value of the pointer pointing to abc and give the caller access to that memory? Intuitively I would say that should not be a form of undefined behavior, or at least not a good undefined behavior? Or, is there something else going on here?
@Go_printf: Is that “%” of “a%” a typo? Because printing “a” can easily happen: foo is called, it initializes an array abc to contain “abc”, it returns the address of the first element of that array, this address survives compiler optimization, the caller dereferences it to get “a”, the caller passes that to printf, and printf prints “a”.
"Is that “%” of “a%” a typo?" The percentage sign appears in a contrasting background. I think that happens because I didn't add \n in printf. By "surviving compiler optimization" you mean that this compiler optimized to return the actual address of abc?
@Go_printf: Yes, if you do not print a new-line, the cursor will be left after the printed text when the program ends, after which the command-line shell will print its prompt, which may be the “%” character. By “surviving compiler optimization” I mean that part of the way in which a compiler may work is by generating code similar to what a human would write and then applying optimization techniques to it—and surviving this means that the initially generated code persists through the result of optimization, rather than being transformed into something different.
1

Because you are creating a local variable abc, that variable is only valid within the scope of the function foo. Returning the address of that variable makes no sense, because as soon as you return from foo the address is no longer valid. Also keep in mind that C implementations typically use the stack to pass arguments to functions and to return values from them. The local variable is also created on the stack, which will be modified by the function call mechanism, so using that address will eventually corrupt the stack.
To return a pointer safely you should use heap allocation (using the malloc family of functions), or you must ensure the variable is still inside an existing scope by the time you use it.

1 Comment

Objects do not exist outside their lifetimes, not their scopes. Scopes are where names are visible. Lifetimes are when objects exist. An object can be accessed in code outside its scope during its lifetime by passing a pointer, as when passing the address of an object to a subroutine.
1

Your second code invokes undefined behavior, as you try to dereference a pointer which points to a local variable. This local variable doesn't exist outside its scope. Thus, the memory isn't valid.

In the first code, you try to access a local variable outside its scope. In this case the function is expected to return a char *. Because you return a local variable, what you get is a null pointer being printed, which doesn't cause a segmentation fault.

1 Comment

Objects do not exist outside their lifetimes, not their scopes. Scopes are where names are visible. Lifetimes are when objects exist. An object can be accessed in code outside its scope during its lifetime by passing a pointer, as when passing the address of an object to a subroutine.
0

Consider the following sequence of events:

  • You check into a hotel, and you’re placed in room 137.
  • You tell the front desk to call your friends and invite them to a wild dance party in your room tomorrow at 2AM.
  • However, your reservation doesn't last until 2AM tomorrow. Perhaps there's another reservation for a different guest. Perhaps the room will stay vacant. Who knows. Maybe the front desk people know. Maybe they don't.

So what should the front desk do?

They could still send an invitation that indicates room 137, perhaps not knowing that you won't be there at that time, because they forgot to check their reservation records. Or maybe they just don't care.

Or they could refuse to send an invitation and tell you that.

Or perhaps they could just ignore your request, not send anything, and not tell anyone.

Or they could send an invitation, but indicate a bogus room number. Perhaps they have invitation blanks prepared beforehand, and they need to just fill in the time and the room number. But being technologically advanced as they are, they won't fill in a room number if they know it is not reserved to this particular guest, and will send out one with a default room number instead (zero, perhaps?).

Perhaps if we live in the future, they might even send an electronic invitation with room 137 indicated in it — that will self-destruct the moment you check out from the hotel!

Whatever they do, they cannot send an invitation indicating a correct room number, because there is no correct room number. You won't be at any room number. So they do whatever. They may always choose one strategy to deal with this situation. Or they may flip a coin. Or perhaps different staff members will do different things. Who knows.

So some of their strategies will produce a spectacular crash (your friend wakes up the wrong guest at the wrong time, they call the police, and it all doesn't end well).

Other strategies will produce less dramatic outcomes. Refuse to continue and let you know? Let you know something is wrong, but continue anyway? Ignore a dangerous instruction? Replace it with a less dangerous instruction? All of these things are possible.

This corresponds to what a compiler might do when you instruct it to do an obviously dangerous and illegal thing. Ignore the danger, or refuse to continue with a diagnostic message, or produce a diagnostic message and continue anyway, or skip the dangerous instruction altogether (but only if it is 100% sure the destruction is imminent), or tweak it slightly so that it is less dangerous. Real compilers actually do all of these things in different circumstances. The important thing is to know that asking a compiler for an impossible thing doesn't always result in the program actually attempting to do the impossible thing.


This answer is in part based on the answer https://stackoverflow.com/a/63862176/775806 which was deleted by its author.


-1

Dereferencing an object which does not exist is undefined behaviour.

Why the first one works: my guess is that the compiler has optimized out the call to the function.

