Is converting a reinterpret_cast'd derived class pointer to base class pointer undefined behavior?

Question

Have a look at this simple example:

struct Base { /* some virtual functions here */ };
struct A: Base { /* members, overridden virtual functions */ };
struct B: Base { /* members, overridden virtual functions */ };

void fn() {
    A a;
    Base *base = &a;
    B *b = reinterpret_cast<B *>(base);
    Base *x = b;
    // use x here, call virtual functions on it
}

Does this little snippet have Undefined Behavior?

The reinterpret_cast is well defined, it returns an unchanged value of base, just with the type of B *.

But I'm not sure about the Base *x = b; line. It uses b, which has a type of B *, but it actually points to an A object. And I'm not sure, whether x is a "proper" Base pointer, whether virtual functions can be called with it.

I don't think the casting itself leads to UB, but attempting to use b to call virtual functions or use B-only members will definitely lead to UB. Think x is safe though.
– Some programmer dude, Commented Apr 5, 2019 at 8:46
@Someprogrammerdude: yep, the question is whether "Think x is safe though" is true or not. I have a feeling that, while this seems harmless (it's a no-op) at first sight, it is UB.
– geza, Commented Apr 5, 2019 at 8:48
reinterpret_cast cannot safely convert between base and derived class pointers/references. b is not guaranteed to be a valid pointer. The only safe thing you can do with it is reinterpret_cast it back to the original type.
– n. m. could be an AI, Commented Apr 5, 2019 at 9:20
@n.m.: suppose that you convert b back with reinterpret_cast. It should give you a proper Base pointer. Now, that reinterpret_cast is nothing other than a conversion to void *, then to Base *. My example code does something similar (it just doesn't have the conversion to void *, and the conversion to Base * is implicit, not through static_cast). Anyway, I'm just playing the devil's advocate here. I have a feeling that the conversion is UB, but cannot back this up with the standard.
– geza, Commented Apr 5, 2019 at 9:50
It seems that the line regarding implicit derived-to-base pointer conversion, "The result of the conversion is a pointer to the base class subobject of the derived class object." (timsong-cpp.github.io/cppwp/conv.ptr#3), means that we have indeed dereferenced b and so hit UB.
– Lawrence, Commented Apr 5, 2019 at 15:38

n. m. could be an AI · Accepted Answer · 2019-04-05 11:36:03Z

4

static_cast (or an implicit derived-to-base-pointer conversion, which does exactly the same thing) is substantially different from reinterpret_cast. There is no guarantee that the base subobject starts at the same address as the complete object.

Most implementations place the first base subobject at the same address as the complete object, but of course even such implementations cannot place two different non-empty base subobjects at the same address. (An object with virtual functions is not empty). When the base subobject is not at the same address as the complete object, static_cast is not a no-op, it involves pointer adjustment.

There are implementations that never place even the first base subobject at the same address as the complete object. It is allowed to place the base subobject after all members of derived, for example. IIRC the Sun C++ compiler used to lay out classes this way (don't know if it's still doing that). On such an implementation, this code is almost guaranteed to fail.

Similar code with B having more than one base will fail on many implementations. Example.
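A minimal sketch of the pointer adjustment described in this answer. The classes `First`, `Second` and `Derived` and the helper functions are hypothetical, not taken from the question. The code only compares addresses and never dereferences the reinterpreted pointer, so it should stay within defined behavior.

```cpp
#include <cassert>

// Hypothetical hierarchy: two polymorphic (hence non-empty) bases.
struct First  { virtual ~First() = default;  int f = 1; };
struct Second { virtual ~Second() = default; int s = 2; };
struct Derived : First, Second { int d = 3; };

// Address produced by the implicit derived-to-base conversion.
// It points at the Second subobject; the compiler adjusts the pointer if needed.
inline const void *converted_address(Derived *p) {
    Second *s = p;  // same as static_cast<Second *>(p)
    return s;
}

// Address produced by reinterpret_cast: the pointer value is left unchanged.
// The result is only inspected here, never dereferenced.
inline const void *reinterpreted_address(Derived *p) {
    return reinterpret_cast<Second *>(p);
}

// Two non-empty base subobjects cannot share an address, so at least
// one of them does not start where the complete object starts.
inline bool bases_are_distinct(Derived *p) {
    First *f = p;
    Second *s = p;
    return static_cast<const void *>(f) != static_cast<const void *>(s);
}

inline void check_adjustment() {
    Derived d;
    // reinterpret_cast keeps the address of the complete object.
    assert(reinterpreted_address(&d) == static_cast<const void *>(&d));
    // The two base subobjects occupy different addresses.
    assert(bases_are_distinct(&d));
    // static_cast undoes its own adjustment on the way back.
    assert(static_cast<Derived *>(static_cast<Second *>(&d)) == &d);
}
```

Calling `check_adjustment()` from `main` exercises the checks. On many implementations `converted_address(&d)` and `reinterpreted_address(&d)` differ, because `Second` is laid out after `First`; that difference is why calling a virtual function through the reinterpreted pointer fails there.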

answered Apr 5, 2019 at 11:36

n. m. could be an AI


22 Comments

geza Over a year ago

You've just raised another problem :) While what you're saying is true, what if Base *base = &a; doesn't move the pointer (this is usually the case)? Or I put an if into my code, so it checks for equality: if ((void*)base==(void*)&a), and only if that's true, then the code does the following things (reinterpret_cast + impl. conversion). Would it be still UB?

n. m. could be an AI Over a year ago

As the other answer points out, b is not a safely derived pointer, and so neither is x. It is implementation defined whether unsafely derived pointers can be dereferenced. I have just clarified the reason why it isn't considered safely derived.

geza Over a year ago

It's strange that the standard says "A pointer value is a safely-derived pointer to a dynamic object". I don't know why dynamic is there. But anyways, checking the list under it, it seems that b is a safely-derived pointer, as it is a "the result of a reinterpret_cast of a safely-derived pointer value" (I think we can consider base as a safely-derived pointer).

n. m. could be an AI Over a year ago

Indeed it's my mistake, apparently safely derived means something other than I thought. I don't know whether this is UB with the additional conditions you provide.

n. m. could be an AI Over a year ago

@geza this answer explains why reinterpret_cast is not a standard-blessed way to cast within a class hierarchy. If you want to ask whether one can exploit properties of object layout in a particular implementation in order to get more use of reinterpret_cast than permitted by the standard, you may want to ask a separate question (the answer will be a resounding "no").


Community · Accepted Answer · 2020-06-20 09:12:55Z

1

The reinterpret_cast is valid (the result can be dereferenced) if the two classes are layout-compatible; that is

they both have standard layout,
they both have the same non-static data members

But the classes do not have standard layout because one of the requirements of StandardLayoutType is that the class has no virtual functions or virtual base classes.
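The standard-layout claim can be checked at compile time. This is a minimal sketch: `Base`, `A` and `B` are stand-ins for the question's classes with a made-up virtual function `id()`, and `Plain` is a hypothetical contrast type.

```cpp
#include <type_traits>

// Stand-ins for the question's classes; id() is invented for illustration.
struct Base { virtual ~Base() = default; virtual int id() const { return 0; } };
struct A : Base { int id() const override { return 1; } };
struct B : Base { int id() const override { return 2; } };

// Hypothetical contrast type with no virtual functions.
struct Plain { int x; int y; };

// Classes with virtual functions are polymorphic...
static_assert(std::is_polymorphic<A>::value, "A has virtual functions");
static_assert(std::is_polymorphic<B>::value, "B has virtual functions");

// ...and therefore cannot be standard-layout.
static_assert(!std::is_standard_layout<A>::value, "A is not standard-layout");
static_assert(!std::is_standard_layout<B>::value, "B is not standard-layout");

// A plain aggregate of ints is standard-layout.
static_assert(std::is_standard_layout<Plain>::value, "Plain is standard-layout");

// Small helper so the result can also be queried from ordinary code.
constexpr bool a_is_standard_layout() {
    return std::is_standard_layout<A>::value;
}
```

If the translation unit compiles, all the assertions hold, so the layout-compatibility rule does not apply to `A` and `B`.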

Regarding the validity of pointers derived from conversions, the standard has this to say in the section on "Safely-derived pointers":

6.7.4.3 Safely-derived pointers

4. An implementation may have relaxed pointer safety, in which case the validity of a pointer value does not depend on whether it is a safely-derived pointer value. Alternatively, an implementation may have strict pointer safety, in which case a pointer value referring to an object with dynamic storage duration that is not a safely-derived pointer value is an invalid pointer value unless the referenced complete object has previously been declared reachable. [ Note: The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined, see 6.7.4.2. This is true even if the unsafely-derived pointer value might compare equal to some safely-derived pointer value. —end note ] It is implementation-defined whether an implementation has relaxed or strict pointer safety.

edited Jun 20, 2020 at 9:12


answered Apr 5, 2019 at 9:42

P.W


5 Comments

BiagioF Over a year ago

However, the OP here does not directly access the memory with the result of reinterpret_cast. Another cast (static_cast) happens before the access.

P.W Over a year ago

But can another cast be used safely on the result of a cast that is invalid?

BiagioF Over a year ago

The result of the cast is not "invalid". The value is still valid, but cannot be dereferenced (otherwise UB). (Same as nullptr: it is a valid value but cannot be dereferenced.)

BiagioF Over a year ago

Still, the question remains. Can a "non-referenceable" be cast again into a "referenceable" one?

curiousguy Over a year ago

@BiagioFesta Yes with a cast back to the original type (X*->Y*->X*)
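A minimal sketch of the round trip mentioned in the last comment (X* -> Y* -> X*), applied to the question's types. `Base`, `A` and `B` are stand-ins with a made-up virtual function `id()`, and the helper names are hypothetical. The guarantee that the original pointer value comes back holds when `B`'s alignment is no stricter than `Base`'s, which the sketch checks explicitly.

```cpp
#include <cassert>

// Stand-ins for the question's classes; id() is invented for illustration.
struct Base { virtual ~Base() = default; virtual int id() const { return 0; } };
struct A : Base { int id() const override { return 1; } };
struct B : Base { int id() const override { return 2; } };

// Precondition for the round-trip guarantee in this sketch.
static_assert(alignof(B) <= alignof(Base), "B must not be stricter-aligned than Base");

// Base* -> B* -> Base*, both steps via reinterpret_cast.
// The intermediate B* is never dereferenced and never converted implicitly.
inline Base *round_trip(Base *base) {
    B *b = reinterpret_cast<B *>(base);
    return reinterpret_cast<Base *>(b);
}

inline int call_through_round_trip() {
    A a;
    Base *x = round_trip(&a);
    // The round trip yields the original pointer value.
    assert(x == static_cast<Base *>(&a));
    return x->id();  // virtual dispatch reaches A::id
}
```

The difference from the question's code is the last step: the question converts `b` to `Base *` implicitly, which treats `b` as pointing to a real `B` object, while this sketch converts back with the same kind of cast that produced `b`.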

atomsymbol · Accepted Answer · 2019-04-08 19:17:58Z

0

If A and B are a verbatim copy of each other (except for their names) and are declared in the same context (same namespace, same #defines, no __LINE__ usage), then common C++ compilers (gcc, clang) will produce two binary representations which are fully interchangeable.

If A and B use the same method signatures but the bodies of corresponding methods differ, it is unsafe to cast A* to B* because the optimization pass in the compiler could for example partially inline the body of void B::method() at the call site b->method() while the programmer's assumption could be that b->method() will call A::method(). Therefore, as soon as the programmer uses an optimizing compiler the behavior of accessing A through type B* becomes undefined.

Problem: All compilers are always at least to some extent "optimizing" the source code passed to them, even at -O0. In cases of behavior not mandated by the C++ standard (that is: undefined behavior), the compiler's implicit assumptions - when all optimizations are turned off - might differ from programmer's assumptions. The implicit assumptions have been made by the developers of the compiler.

Conclusion: If the programmer is able to avoid using an optimizing compiler then it is safe to access A via B*. The only issue such a programmer needs to tackle is that non-optimizing compilers do not exist.

A managed C++ implementation might abort the program when A* is cast to B* via reinterpret_cast, when b->field is accessed, or when b->method() is called. Some other managed C++ implementation might try harder to avoid a program crash and so it will resort to temporary duck typing when it sees the program accessing A via B*.

Some questions are:

Can the programmer guess what the managed C++ implementation will do in cases of behavior not mandated by the C++ standard?
What if the programmer sends the code to another programmer who will pass it to a different managed C++ implementation?
If a case isn't covered by the C++ standard, does it mean that a C++ implementation can choose to do anything it considers appropriate in order to cope with the case?

answered Apr 8, 2019 at 19:17

atomsymbol


2 Comments

geza Over a year ago

My question has the tag of language-lawyer. This means that it doesn't matter what compilers do. The question is, what the standard says.

atomsymbol Over a year ago

@geza Yes, although on the other hand a programmer is always accessing the C++ standard via a particular C++ implementation. From language-lawyer viewpoint, the only correct acceptable answer is just a single line: "The standard does not cover the case." - adding anything on top of that is completely superfluous from language-lawyer viewpoint and you shouldn't have accepted it as the best answer to your question.

water · Accepted Answer · 2019-04-08 03:40:31Z

-1

Yes, it does have undefined behavior. The layout of the Base subobject within A and B is undefined, so x may not point to a real Base object.

answered Apr 8, 2019 at 3:40

water

