11

Have a look at this simple example:

struct Base { /* some virtual functions here */ };
struct A: Base { /* members, overridden virtual functions */ };
struct B: Base { /* members, overridden virtual functions */ };

void fn() {
    A a;
    Base *base = &a;
    B *b = reinterpret_cast<B *>(base);
    Base *x = b;
    // use x here, call virtual functions on it
}

Does this little snippet have Undefined Behavior?

The reinterpret_cast is well defined, it returns an unchanged value of base, just with the type of B *.

But I'm not sure about the Base *x = b; line. It uses b, which has the type B *, but actually points to an A object. And I'm not sure whether x is a "proper" Base pointer, that is, whether virtual functions can be called through it.

14
  • I don't think the casting itself leads to UB, but attempting to use b to call virtual functions or use B only members will definitely lead to UB. Think x is safe though. Commented Apr 5, 2019 at 8:46
  • @Someprogrammerdude: yep, the question is, "Think x is safe though" true or not. I have a feeling, that while this seems harmless (it's a no-op) at the first sight, it is UB. Commented Apr 5, 2019 at 8:48
  • reinterpret_cast cannot safely convert between base and derived class pointers/references. b is not guaranteed to be a valid pointer. The only safe thing you can do with it is reinterpret_cast it back to the original type. Commented Apr 5, 2019 at 9:20
  • @n.m.: suppose that you convert b back with reinterpret_cast. It should give you a proper Base pointer. Now, that reinterpret_cast is nothing other than a conversion to void *, then to Base *. My example code does something similar (it just doesn't have the conversion to void *, and the conversion to Base * is implicit, not through static_cast). Anyway, I'm just playing devil's advocate here. I have a feeling that the conversion is UB, but I cannot back this up with the standard. Commented Apr 5, 2019 at 9:50
  • It seems that the line regarding implicit derived-to-base pointer conversion: "The result of the conversion is a pointer to the base class subobject of the derived class object." (timsong-cpp.github.io/cppwp/conv.ptr#3), means that we have indeed dereferenced b and so hit UB. Commented Apr 5, 2019 at 15:38

4 Answers

4

static_cast (or an implicit derived-to-base pointer conversion, which does exactly the same thing) is substantially different from reinterpret_cast. There is no guarantee that the base subobject starts at the same address as the complete object.

Most implementations place the first base subobject at the same address as the complete object, but of course even such implementations cannot place two different non-empty base subobjects at the same address. (An object with virtual functions is not empty). When the base subobject is not at the same address as the complete object, static_cast is not a no-op, it involves pointer adjustment.

There are implementations that never place even the first base subobject at the same address as the complete object. An implementation is allowed to place the base subobject after all members of the derived class, for example. IIRC the Sun C++ compiler used to lay out classes this way (I don't know if it still does). On such an implementation, this code is almost guaranteed to fail.

Similar code with B having more than one base will fail on many implementations. Example.
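The pointer adjustment described above can be observed with a small sketch. The names Base1, Base2, Derived and base2_is_adjusted are hypothetical and not part of the question's code; the sketch only shows that an implicit derived-to-base conversion may change the address, which reinterpret_cast never does.

```cpp
#include <cassert>

// Hypothetical hierarchy: two non-empty (polymorphic) bases.
struct Base1 { virtual ~Base1() = default; int x = 1; };
struct Base2 { virtual ~Base2() = default; int y = 2; };
struct Derived : Base1, Base2 { int z = 3; };

// Returns true when the two base subobjects live at different addresses,
// meaning at least one derived-to-base conversion adjusted the pointer.
bool base2_is_adjusted() {
    Derived d;
    Derived *pd = &d;
    Base1 *p1 = pd;  // implicit derived-to-base conversion
    Base2 *p2 = pd;  // implicit derived-to-base conversion
    // Two distinct non-empty base subobjects cannot share an address.
    return static_cast<void *>(p1) != static_cast<void *>(p2);
}
```

A reinterpret_cast from Derived * to Base2 * would keep the address of the complete object, so the result would not point at the Base2 subobject on an implementation where p2 was adjusted.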


22 Comments

You've just raised another problem :) While what you're saying is true, what if Base *base = &a; doesn't move the pointer (this is usually the case)? Or I put an if into my code, so it checks for equality: if ((void*)base==(void*)&a), and only if that's true, then the code does the following things (reinterpret_cast + impl. conversion). Would it be still UB?
As the other answer points out, b is not a safely derived pointer, and so neither is x. It is implementation-defined whether unsafely derived pointers can be dereferenced. I have just clarified the reason why it isn't considered safely derived.
It's strange that the standard says "A pointer value is a safely-derived pointer to a dynamic object". I don't know why dynamic is there. But anyways, checking the list under it, it seems that b is a safely-derived pointer, as it is a "the result of a reinterpret_­cast of a safely-derived pointer value" (I think we can consider base as a safely-derived pointer).
Indeed it's my mistake, apparently safely derived means something other than I thought. I don't know whether this is UB with the additional conditions you provide.
@geza this answer explains why reinterpret_cast is not a standard-blessed way to cast within a class hierarchy. If you want to ask whether one can exploit properties of object layout in a particular implementation in order to get more use of reinterpret_cast than permitted by the standard, you may want to ask a separate question (the answer will be a resounding "no").
1

The reinterpret_cast is valid (the result can be dereferenced) if the two classes are layout-compatible; that is

  • they both have standard layout,
  • they both have the same non-static data members

But the classes do not have standard layout, because one of the requirements of StandardLayoutType is that the class has no virtual functions or virtual base classes.
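The standard-layout requirement can be checked at compile time with std::is_standard_layout. The sketch below assumes a hypothetical virtual function id() and a hypothetical struct Plain, since the question elides the actual members; standard_layout is a helper name introduced only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Stand-ins for the question's classes; id() is a hypothetical virtual function.
struct Base { virtual ~Base() = default; virtual int id() const { return 0; } };
struct A : Base { int member_a = 0; int id() const override { return 1; } };
struct B : Base { int member_b = 0; int id() const override { return 2; } };

// A class without virtual functions, for contrast.
struct Plain { int first; int second; };

// Thin wrapper around the standard trait.
template <class T>
constexpr bool standard_layout() { return std::is_standard_layout<T>::value; }

// Polymorphic classes are never standard-layout.
static_assert(!standard_layout<A>(), "A has virtual functions");
static_assert(!standard_layout<B>(), "B has virtual functions");
static_assert(standard_layout<Plain>(), "Plain has no virtual functions");
```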

Regarding the validity of pointers derived from conversions, the standard has this to say in the section on "Safely-derived pointers":

6.7.4.3 Safely-derived pointers

4. An implementation may have relaxed pointer safety, in which case the validity of a pointer value does not depend on whether it is a safely-derived pointer value. Alternatively, an implementation may have strict pointer safety, in which case a pointer value referring to an object with dynamic storage duration that is not a safely-derived pointer value is an invalid pointer value unless the referenced complete object has previously been declared reachable. [ Note: The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined, see 6.7.4.2. This is true even if the unsafely-derived pointer value might compare equal to some safely-derived pointer value. —end note ] It is implementation-defined whether an implementation has relaxed or strict pointer safety.

5 Comments

However, the OP here does not directly access the memory with the result of reinterpret_cast. Another cast (static_cast) happens before the access.
But can another cast be used safely on the result of a cast that is invalid?
The result of the cast is not "invalid". The value is still valid, but cannot be dereferenced (otherwise UB). (Same as nullptr, which is a valid value but cannot be dereferenced.)
Still, the question remains. Can a "non-dereferenceable" pointer be cast again into a "dereferenceable" one?
@BiagioFesta Yes with a cast back to the original type (X*->Y*->X*)
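A minimal sketch of that round trip (X* -> Y* -> X*), assuming a hypothetical virtual function id() and assuming B is no more strictly aligned than Base, which is the condition under which reinterpret_cast is specified to give back the original pointer value:

```cpp
#include <cassert>

// Stand-ins for the question's classes; id() is a hypothetical virtual function.
struct Base { virtual ~Base() = default; virtual int id() const { return 0; } };
struct A : Base { int id() const override { return 1; } };
struct B : Base { int id() const override { return 2; } };

// The round-trip guarantee requires that B's alignment is no stricter than Base's.
static_assert(alignof(B) <= alignof(Base), "round trip would not be guaranteed");

int call_through_round_trip() {
    A a;
    Base *base = &a;
    B *b = reinterpret_cast<B *>(base);        // stored only, never dereferenced or converted
    Base *back = reinterpret_cast<Base *>(b);  // cast back to the original type
    return back->id();                         // dispatches to A::id
}
```

Unlike the question's Base *x = b;, this never performs a derived-to-base conversion on b, so it does not rely on b pointing to a real B object.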
0

If A and B are a verbatim copy of each other (except for their names) and are declared in the same context (same namespace, same #defines, no __LINE__ usage), then common C++ compilers (gcc, clang) will produce two binary representations which are fully interchangeable.

If A and B use the same method signatures but the bodies of corresponding methods differ, it is unsafe to cast A* to B* because the optimization pass in the compiler could for example partially inline the body of void B::method() at the call site b->method() while the programmer's assumption could be that b->method() will call A::method(). Therefore, as soon as the programmer uses an optimizing compiler the behavior of accessing A through type B* becomes undefined.

Problem: All compilers are always at least to some extent "optimizing" the source code passed to them, even at -O0. In cases of behavior not mandated by the C++ standard (that is: undefined behavior), the compiler's implicit assumptions - when all optimizations are turned off - might differ from programmer's assumptions. The implicit assumptions have been made by the developers of the compiler.

Conclusion: If the programmer is able to avoid using an optimizing compiler, then it is safe to access A via B*. The only issue such a programmer needs to tackle is that non-optimizing compilers do not exist.


A managed C++ implementation might abort the program when A* is cast to B* via reinterpret_cast, when b->field is accessed, or when b->method() is called. Some other managed C++ implementation might try harder to avoid a program crash, and so resort to temporary duck typing when it sees the program accessing A via B*.

Some questions are:

  • Can the programmer guess what the managed C++ implementation will do in cases of behavior not mandated by the C++ standard?
  • What if the programmer sends the code to another programmer who will pass it to a different managed C++ implementation?
  • If a case isn't covered by the C++ standard, does it mean that a C++ implementation can choose to do anything it considers appropriate in order to cope with the case?

2 Comments

My question has the tag of language-lawyer. This means that it doesn't matter what compilers do. The question is, what the standard says.
@geza Yes, although on the other hand a programmer is always accessing the C++ standard via a particular C++ implementation. From language-lawyer viewpoint, the only correct acceptable answer is just a single line: "The standard does not cover the case." - adding anything on top of that is completely superfluous from language-lawyer viewpoint and you shouldn't have accepted it as the best answer to your question.
-1

Yes, it does have undefined behavior. The layout of the Base subobject within A and B is not defined, so x may not point to a real Base object.

Comments
