
I recently tried the following program. It compiles, runs fine, and produces the expected output rather than a runtime error.

#include <iostream>
class demo
{
    public:
        static void fun()
        {
            std::cout<<"fun() is called\n";
        }
        static int a;
};
int demo::a=9;
int main()
{
    demo* d=nullptr;
    d->fun();
    std::cout<<d->a;
    return 0;
}

If an uninitialized pointer is used to access class and/or struct members, the behaviour is undefined. Why, then, is it allowed to access static members through a null pointer? Is there any harm in my program?

12
  • 5
    Is there any harm in my program? It is still UB. Commented Feb 12, 2015 at 16:38
  • 13
    Undefined behavior does not mean that the code is required to crash; rather it means that anything at all is allowed to happen, the result is undefined. That is, the code could appear to work fine and as expected, it could crash, it could appear to run fine but give you the wrong result, anything at all. Commented Feb 12, 2015 at 16:38
  • 4
    Voted to reopen; the linked question addresses non-static members, not static ones. Commented Feb 12, 2015 at 16:41
  • 3
    The biggest problem is maintainability. It should be demo::fun() and demo::a, and if someone later edits the code they might actually try to use that pointer. Commented Feb 12, 2015 at 17:28

5 Answers

35

TL;DR: Your example is well-defined. Merely dereferencing a null pointer is not invoking UB.

There is a lot of debate over this topic, which basically boils down to whether indirection through a null pointer is itself UB.
The only questionable thing that happens in your example is the evaluation of the object expression. In particular, d->a is equivalent to (*d).a according to [expr.ref]/2:

The expression E1->E2 is converted to the equivalent form (*(E1)).E2; the remainder of 5.2.5 will address only the first option (dot).

*d is just evaluated:

The postfix expression before the dot or arrow is evaluated; [65] the result of that evaluation, together with the id-expression, determines the result of the entire postfix expression.

65) If the class member access expression is evaluated, the subexpression evaluation happens even if the result is unnecessary to determine the value of the entire postfix expression, for example if the id-expression denotes a static member.
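To make the footnote concrete, here is a minimal sketch in which the object expression has an observable side effect. The names Counter, calls, next_counter and read_through_call are hypothetical and not part of the question's code. The pointer is always valid here, so nothing in the sketch depends on the null-pointer debate.

```cpp
struct Counter {
    static int value;   // static data member, shared by all instances
};
int Counter::value = 9;

int calls = 0;          // hypothetical counter used to observe the evaluation

// Hypothetical helper: has a visible side effect and returns a valid pointer.
Counter* next_counter() {
    static Counter instance;
    ++calls;
    return &instance;
}

// Reads the static member through the pointer returned by next_counter().
// The call is evaluated even though its result is not needed to find value.
int read_through_call() {
    return next_counter()->value;
}
```

Calling read_through_call() once returns 9 and leaves calls at 1, which shows that the subexpression before the arrow is evaluated even when the member is static.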

Let's extract the critical part of the code. Consider the expression statement

*d;

In this statement, *d is a discarded-value expression according to [stmt.expr]. So *d is merely evaluated [1], just as in d->a.
Hence if *d; is valid, in other words if the evaluation of the expression *d is valid, so is your example.

Does indirection through null pointers inherently result in undefined behavior?

There is the open CWG issue #232, created over fifteen years ago, which concerns this exact question. A very important argument is raised. The report starts with

At least a couple of places in the IS state that indirection through a null pointer produces undefined behavior: 1.9 [intro.execution] paragraph 4 gives "dereferencing the null pointer" as an example of undefined behavior, and 8.3.2 [dcl.ref] paragraph 4 (in a note) uses this supposedly undefined behavior as justification for the nonexistence of "null references."

Note that the example mentioned was changed to cover modifications of const objects instead, and the note in [dcl.ref] - while still existing - is not normative. The normative passage was removed to avoid commitment.

However, 5.3.1 [expr.unary.op] paragraph 1, which describes the unary "*" operator, does not say that the behavior is undefined if the operand is a null pointer, as one might expect. Furthermore, at least one passage gives dereferencing a null pointer well-defined behavior: 5.2.8 [expr.typeid] paragraph 2 says

If the lvalue expression is obtained by applying the unary * operator to a pointer and the pointer is a null pointer value (4.10 [conv.ptr]), the typeid expression throws the bad_typeid exception (18.7.3 [bad.typeid]).

This is inconsistent and should be cleaned up.

The last point is especially important. The quote in [expr.typeid] still exists and appertains to glvalues of polymorphic class type, which is the case in the following example:

#include <iostream>
#include <typeinfo>

int main() try {

    // Polymorphic type
    class A
    {
        virtual ~A(){}
    };

    typeid( *((A*)0) );  // the operand is evaluated; throws std::bad_typeid

}
catch (const std::bad_typeid&)
{
    std::cerr << "bad_typeid\n";
}

The behavior of this program is well-defined (an exception will be thrown and caught), and the expression *((A*)0) is evaluated as it isn't part of an unevaluated operand. Now if indirection through null pointers induced UB, then the expression written as

*((A*)0);

would be doing just that, inducing UB, which seems nonsensical when compared to the typeid scenario. If the above expression is merely evaluated, as every discarded-value expression is [1], where is the crucial difference that makes the evaluation in the second snippet UB? There is no existing implementation that analyzes the typeid operand, finds the innermost corresponding dereference, and surrounds its operand with a check; doing so would also cost performance.

A note in that issue then ends the short discussion with:

We agreed that the approach in the standard seems okay: p = 0; *p; is not inherently an error. An lvalue-to-rvalue conversion would give it undefined behavior.

I.e. the committee agreed upon this. Although the proposed resolution of this report, which introduced so-called "empty lvalues", was never adopted…

However, “not modifiable” is a compile-time concept, while in fact this deals with runtime values and thus should produce undefined behavior instead. Also, there are other contexts in which lvalues can occur, such as the left operand of . or .*, which should also be restricted. Additional drafting is required.

that does not affect the rationale. Then again, it should be noted that this issue even precedes C++03, which makes it less convincing while we approach C++17.


CWG-issue #315 seems to cover your case as well:

Another instance to consider is that of invoking a member function from a null pointer:

  struct A { void f () { } };
  int main ()
  {
    A* ap = 0;
    ap->f ();
  }

[…]

Rationale (October 2003):

We agreed the example should be allowed. p->f() is rewritten as (*p).f() according to 5.2.5 [expr.ref]. *p is not an error when p is null unless the lvalue is converted to an rvalue (4.1 [conv.lval]), which it isn't here.

According to this rationale, indirection through a null pointer per se does not invoke UB without further lvalue-to-rvalue conversions (=accesses to stored value), reference bindings, value computations or the like. (Nota bene: Calling a non-static member function with a null pointer should invoke UB, albeit merely hazily disallowed by [class.mfct.non-static]/2. The rationale is outdated in this respect.)

I.e. a mere evaluation of *d does not suffice to invoke UB. The identity of the object is not required, and neither is its previously stored value. On the other hand, e.g.

*p = 123;

is undefined since there is a value computation of the left operand, [expr.ass]/1:

In all cases, the assignment is sequenced after the value computation of the right and left operands

Because the left operand is expected to be a glvalue, the identity of the object referred to by that glvalue must be determined as mentioned by the definition of evaluation of an expression in [intro.execution]/12, which is impossible (and thus leads to UB).
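Whatever the outcome of that debate, a store through a pointer that may be null can be guarded so that the assignment is never evaluated for a null operand. A minimal sketch, assuming a hypothetical helper store_if_non_null that is not part of the original discussion:

```cpp
// Hypothetical helper: performs *p = value only when p is not null.
// Returns true if the store happened, false otherwise.
bool store_if_non_null(int* p, int value) {
    if (p == nullptr) {
        return false;   // no indirection and no assignment for a null pointer
    }
    *p = value;         // p refers to an object here, so the store is fine
    return true;
}
```

With this guard the left operand of the assignment always designates an actual object, so its identity can be determined.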


[1] [expr]/11:

In some contexts, an expression only appears for its side effects. Such an expression is called a discarded-value expression. The expression is evaluated and its value is discarded. […]. The lvalue-to-rvalue conversion (4.1) is applied if and only if the expression is a glvalue of volatile-qualified type and […]


35 Comments

It does not look as if that resolution ever made it into any standard. Probably because of what it would mean for references...
If you are aware that the resolution was never made law, then why quote it as if it was? Better find something which did, especially as that one is 13+ years old (two minor and one major standard in the interim).
"Dereferencing a null pointer doesn't invoke UB without further lvalue-to-rvalue conversions (=accesses to stored value) or reference bindings" that's what some want, not what actually is. Happy with my extending the quote?
@Columbo Clang and GCC have a whole set of them. gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html, search for -fsanitize.
@Columbo A sanitizer that turns well-defined code into errors is going to be pretty annoying IMO. Regardless, I see no reason why the standard shouldn't restrict it, if permitting it only allows "insane" code.
4

From the C++ Draft Standard N3337:

9.4 Static members

2 A static member s of class X may be referred to using the qualified-id expression X::s; it is not necessary to use the class member access syntax (5.2.5) to refer to a static member. A static member may be referred to using the class member access syntax, in which case the object expression is evaluated.

And in the section about object expression...

5.2.5 Class member access

4 If E2 is declared to have type “reference to T,” then E1.E2 is an lvalue; the type of E1.E2 is T. Otherwise, one of the following rules applies.

— If E2 is a static data member and the type of E2 is T, then E1.E2 is an lvalue; the expression designates the named member of the class. The type of E1.E2 is T.

Based on the last paragraph quoted from the standard, the expressions:

  d->fun();
  std::cout << d->a;

work because they both designate the named member of the class regardless of the value of d.
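The first quote also says that the qualified-id form is sufficient, so the same members can be reached without any object expression at all. A minimal sketch, adapted from the question's class: here fun() returns its message instead of printing it, and the wrappers call_fun and read_a are hypothetical names added for illustration.

```cpp
#include <string>

class demo {
public:
    // Adapted from the question: returns the message instead of printing it.
    static std::string fun() { return "fun() is called"; }
    static int a;
};
int demo::a = 9;

// Qualified-id access: no pointer and no object expression is involved.
std::string call_fun() { return demo::fun(); }
int read_a() { return demo::a; }
```

Because no pointer appears in demo::fun() or demo::a, the question of what a null d means never arises with this spelling.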

10 Comments

"If E2 is a static data member ... the expression designates the named member." It says it right there, and it makes sense because the pointer is completely irrelevant.
@KennyOstrom: I don't see any mandate to ignore UB invoked by the expression E1 in that quote, sorry.
@RSahu: If that was the case, they would have designated it as an "unevaluated context", which they explicitly did not.
@RSahu The LHS of . needs to be evaluated even if it's static, or g().f() might not evaluate g().
Even though the expression designates the named member of the class, that doesn't "bypass" the evaluation of E1 . A more stark example, g()->a where g divides by zero for example, and then returns a null pointer
4

runs fine and produces expected output instead of any runtime error.

That's a basic assumption error. What you are doing is undefined behavior, which means that your claim for any kind of "expected output" is faulty.

Addendum: Note that, while there is a CWG defect report (#315) that was closed "in agreement" that the above should not be UB, it relies on a positive resolution of another CWG defect report (#232) that is still active, and hence none of it has been added to the standard.

Let me quote a part of a comment from James McNellis to an answer to a similar Stack Overflow question:

I don't think CWG defect 315 is as "closed" as its presence on the "closed issues" page implies. The rationale says that it should be allowed because "*p is not an error when p is null unless the lvalue is converted to an rvalue." However, that relies on the concept of an "empty lvalue," which is part of the proposed resolution to CWG defect 232, but which has not been adopted.

4 Comments

@Columbo: Now if you would show that those resolutions ever made it into the standard, you would have a point.
@Columbo: I added a blurb addendum from James McNellis that clarifies why your answer doesn't dispute that it's UB.
Now this is the only correct answer. Shame I cannot upvote it again.
Is there any reason why a quality implementation should ever care about the instance in this scenario? I suppose it might be fair to warn that code might get broken by compiler designers who take pride in finding "clever" ways to avoid making their compilers do anything not mandated by the Standard. On the other hand, the Standard only defines a "conforming" implementation, rather than an "implementation whose usefulness isn't undermined by obtuseness"; the fact that some particular behavior would be allowable by the former doesn't mean the latter could behave likewise.
1

What you are seeing here is what I would consider an ill-conceived and unfortunate design choice in the specification of the C++ language and many other languages that belong to the same general family of programming languages.

These languages allow you to refer to static members of a class using a reference to an instance of the class. The actual value of the instance reference is of course ignored, since no instance is required to access static members.

So, in d->fun(); the compiler uses the d pointer only during compilation to figure out that you are referring to a member of the demo class, and then it ignores it. No code is emitted by the compiler to dereference the pointer, so the fact that it is going to be NULL during runtime does not matter.

So, what you see happening is in perfect accordance to the specification of the language, and in my opinion the specification suffers in this respect, because it allows an illogical thing to happen: to use an instance reference to refer to a static member.

P.S. Most compilers in most languages are actually capable of issuing warnings for that kind of stuff. I do not know about your compiler, but you might want to check, because the fact that you received no warning for doing what you did might mean that you do not have enough warnings enabled.

9 Comments

Your proposed change would possibly break existing code. template <class T> f(T &t) { t.g(); } struct StillWorking { void g() {} }; struct NowBroken { static void g(); } }; f(StillWorking()); f(NowBroken());
"It should be impossible to use an instance reference to refer to a static member." That ship has sailed a long, long time ago.
P.S.: I meant to write an example with const& (mine won't compile anyway), but the point stands.
@T.C. yes, it has sailed, but am I not entitled to have the opinion that it should never have been this way?
@ChristianHackl would you be satisfied if I had written "It should have been impossible" instead of "It should be impossible" ?
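The generic-code argument from the comments above can be sketched in a form that compiles. This is an assumption about what the commenter intended: the names NonStaticG, StaticG and call_g are hypothetical, a const reference is used as the follow-up comment suggests, and g() returns an int only so that the result can be observed.

```cpp
// g() is an ordinary (non-static) member function.
struct NonStaticG {
    int g() const { return 1; }
};

// g() is a static member function.
struct StaticG {
    static int g() { return 2; }
};

// Generic code uses the same member access syntax for both kinds of g().
// Forbidding access to static members through an instance would make the
// StaticG instantiation ill-formed.
template <class T>
int call_g(const T& t) {
    return t.g();
}
```

Both call_g(NonStaticG{}) and call_g(StaticG{}) compile today, and t refers to a real object in both cases, so this usage is independent of the null-pointer question.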
1

The expressions d->fun() and d->a both cause evaluation of *d ([expr.ref]/2).

The complete definition of the unary * operator from [expr.unary.op]/1 is:

The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points.

For the expression d there is no "object or function to which the expression points". Therefore this paragraph does not define the behaviour of *d.

Hence the code is undefined by omission, since the behaviour of evaluating *d is not defined anywhere in the Standard.

1 Comment

@HolyBlackCat That is correct but my answer depends on the text "the object or function to which the expression points"
