10

I recently saw a class like this that was used to construct objects "on-demand" without having to use dynamic memory allocation for various reasons.

#include <cassert>

template<typename T>
class StaticObject
{
public:
    StaticObject() : constructed_(false)
    {
    }

    ~StaticObject()
    {
        if (constructed_)
            ((T*)object_)->~T();
    }

    void construct()
    {
        assert(!constructed_);

        new ((T*)object_) T;
        constructed_ = true;
    }

    T& operator*()
    {
        assert(constructed_);

        return *((T*)object_);
    }

    const T& operator*() const
    {
        assert(constructed_);

        return *((T*)object_);
    }

private:
    bool constructed_;
    alignas(alignof(T)) char object_[sizeof(T)];
};

Is this code, namely the casting of a properly aligned char array to an object pointer, considered undefined behavior by the C++14 standard or is it completely fine?

12
  • It is fine if you do it carefully. Besides, there are std::optional in C++17 and boost::optional doing similar thing. But I would leave it to people who know better to examine this specific case. Commented Jul 8, 2018 at 12:05
  • 1
    Use placement new. This will avoid any UB. Commented Jul 8, 2018 at 12:05
  • 5
    @RichardCritten isn't new ((T*)object_) T; exactly a placement new ? Commented Jul 8, 2018 at 12:17
  • 1
    Side note: cast is not necessary as placement new accepts a void pointer anyway... Commented Jul 8, 2018 at 12:22
  • 1
    There's a standard class intended to be used this way... doesn't directly answer the question w.r.t char, does prove that the standard expects at least something along these lines to be possible. Commented Jul 8, 2018 at 12:30

4 Answers 4

10

This program technically has undefined behavior, although it's likely to work on most implementations. The issue is that a cast from char* to T* is not guaranteed to result in a valid pointer to the T object created by placement new, even though the char* pointer represents the address of the first byte used for storage for the T object.

[basic.compound]/3:

Pointers to layout-compatible types shall have the same value representation and alignment requirements ([basic.align]).

In general, T will not be layout-compatible with char or with alignas(T) char[sizeof(T)], so there's no requirement that a pointer T* has the same value representation as a pointer char* or void*.

[basic.compound]/4:

Two objects a and b are pointer-interconvertible if:

  • they are the same object, or

  • one is a union object and the other is a non-static data member of that object ([class.union]), or

  • one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, any base class subobject of that object ([class.mem]), or

  • there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.

If two objects are pointer-interconvertible, then they have the same address, and it is possible to obtain a pointer to one from a pointer to the other via a reinterpret_cast. [ Note: An array object and its first element are not pointer-interconvertible, even though they have the same address. — end note ]

[Aside: DR 2287 changed "standard-layout union" to "union" in the second bullet after the publication of C++17. But that doesn't affect this program.]

The T object created by the placement new is not pointer-interconvertible with object_ or with object_[0]. And the note hints that this might be a problem for casts...

For the C-style cast ((T*)object_), we need to see [expr.cast]/4:

The conversions performed by

  • a const_cast,

  • a static_cast,

  • a static_cast followed by a const_cast,

  • a reinterpret_cast, or

  • a reinterpret_cast followed by a const_cast

can be performed using the cast notation of explicit type conversion....

If a conversion can be interpreted in more than one of the ways listed above, the interpretation that appears first in the list is used, even if a cast resulting from that interpretation is ill-formed.

Unless T is char or cv-qualified char, this will effectively be a reinterpret_cast, so next we look at [expr.reinterpret.cast]/7:

An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of object pointer type is converted to the object pointer type "pointer to cv T", the result is static_­cast<cv T*>(static_­cast<cv void*>(v)).

So first we have a static_cast from char* to void*, which does the standard conversion described in [conv.ptr]/2:

A prvalue of type "pointer to cv T", where T is an object type, can be converted to a prvalue of type "pointer to cv void". The pointer value ([basic.compound]) is unchanged by this conversion.

This is followed by a static_cast from void* to T*, described in [expr.static.cast]/13:

A prvalue of type "pointer to cv1 void" can be converted to a prvalue of type "pointer to cv2 T", where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.

As already noted, the object of type T is not pointer-interconvertible with object_[0], so that sentence does not apply, and there's no guarantee that the result T* points at the T object! We're left with the sentence saying "the pointer value is unchanged", but this might not be the result we want if the value representations for char* and T* pointers are too different.

A Standard-compliant version of this class could be implemented using a union:

template<typename T>
class StaticObject
{
public:
    StaticObject() : constructed_(false), dummy_(0) {}
    ~StaticObject()
    {
        if (constructed_)
            object_.~T();
    }
    StaticObject(const StaticObject&) = delete; // or implement
    StaticObject& operator=(const StaticObject&) = delete; // or implement

    void construct()
    {
        assert(!constructed_);

        new(&object_) T;
        constructed_ = true;
    }

    T& operator*()
    {
        assert(constructed_);

        return object_;
    }

    const T& operator*() const
    {
        assert(constructed_);

        return object_;
    }

private:
    bool constructed_;
    union {
        unsigned char dummy_;
        T object_;
    }
};

Or even better, since this class is essentially attempting to implement an optional, just use std::optional if you have it or boost::optional if you don't.

Sign up to request clarification or add additional context in comments.

4 Comments

So summary is: You cannot cast a unsigned char[]-storage to a T* because they are (generally) not pointer-interconvertible, correct? How does that cover the use cases of aligned_storage? Prior to C++17 (requirement to use std::launder) this seemed to be a very valid use case. I think the rules are there to cover the alignment issue as indicated by does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. See also the answer of @eerorika
@Flamefire The alignment requirement is separate from the pointer-interconvertible requirement. Yes, before C++17 this was fine given relaxed pointer safety, because of [basic.compound]/3 "If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained." And there was no "pointer-interconvertible", so the cast is defined to preserve the address.
@Flamefire But also, the cast produces a pointer which points at the T but is not a safely-derived pointer value. So even in C++11/C++14, if the implementation has strict pointer safety, that pointer is invalid. The fix would be std::declare_reachable. I'm not sure std::launder actually fixes either issue in C++17, since there's no longer a guarantee the result of the cast represents the same address in the first place.
Isn't std::declare_reachable for GC? And I do think "the result of the cast represents the same address", see my answer. Please check if I missed anything but I believe it to be correct. I tried to collect all pieces and connect them to why it works. Otherwise commonly used constructs won't be valid and std::aligned_storage would be useless.
8

Casting a char array to an object pointer - is this UB?

Casting one pointer (the array decays to a pointer) to another pointer that is not in same inheritance hierarchy using a C-style cast performs a reinterpret cast. A reinterpret cast itself never has UB.

However, indirecting a converted pointer can have UB if an object of appropriate type has not been constructed into that address. In this case, an object has been constructed in the character array, so the indirection has well defined behaviour. Edit: The indirection would be UB free, if it weren't for the strict aliasing rules; see ascheplers answer for details. aschepler shows a C++14 conforming solution. In C++17, your code can be corrected with following changes:

void construct()
{
    assert(!constructed_);
    new (object_) T; // removed cast
    constructed_ = true;
}

T& operator*()
{
    assert(constructed_);
    return *(std::launder((T*)object_));
}

To construct an object into an array of another type, three requirements must be met to avoid UB: The other type must be allowed to alias the object type (char, unsigned char and std::byte satisfy this requirement for all object types), the address must be aligned to the memory boundary as required by the object type and none of the memory must overlap with the lifetime of another object (ignoring the underlying objects of the array which are allowed to alias the overlaid object). All of those requirements are satisfied by your program.

8 Comments

Your second paragraph is wrong, because you're violating struct aliasing (object_ doesn't have type T*)
@Rakete1111 It doesn't need to have type T*. Reusing char storage is allowed.
Reusing char storage is allowed yes, but what I'm saying is that you can cast an object pointer to char* (and back) but if you start with a char*, then you cannot get an object pointer from it, even with new.
@Rakete1111 That's incorrect. If reusing char storage is allowed (and it is), then how do you propose to access the reused object if "you cannot get an object pointer from it" were true?
@Rakete1111 I did a bit of researching, and it seems you're correct, though I find the rules frustrating. As I understand, OP's code could be fixed by removing the bad cast in construct, and by adding std::launder before indirecting the converted pointer. But std::launder is not in C++14 :(
|
0

After writing the comment to @aschepler answer I think I found the proper answer:

No it is not UB!

Very strong hint: aligned_storage is exactly for doing that.

  • basic.compound[4] gives us the definition of "pointer-interconvertible". None of the cases apply so T* and unsigned char[...] are not pointer-interconvertible.
  • conv.ptr[2] and expr.static.cast[13] tells us what happens on the reinterprer_cast<T*>(object_). Basically the (intermediate) cast to void* does not change the value of the pointer and the cast from void* to T* does also not change it:

    If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.

    We have a properly aligned, not pointer-interconvertible type here. Hence unchanged value.

  • Now before P0137 (found in another answer) basic.compound[3] said:

    If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained.

    Now it says basic.compound[3]

    Every value of pointer type is one of the following:

    (3.1) a pointer to an object or function (the pointer is said to point to the object or function), [...]

    Which I consider equivalent for this purpose.

  • Finally we need basic.lval[11]

    If a program attempts to access the stored value of an object through a glvalue whose type is not similar ([conv.qual]) to one of the following types the behavior is undefined:52 [...]

    (11.3) a char, unsigned char, or std​::​byte type.

    This boils down to the aliasing rules which only allow certain types to alias and our unsigned char is part of it.

So in summary:

  • We meet the alignment and aliasing rules
  • We get a defined pointer value to a T* (which is the same as the unsigned char*)
  • Hence we have a valid object at that place

This is basically what @eerorika also has. But I think from the arguments above that the code is fully valid at least if T does not have any const are reference members in which case std::launder must be used. Even then, if the memory is not reused (but only used for creating 1 T) then it should also be valid.

However older GCC (<7.2) complains about strict-aliasing violation: https://godbolt.org/z/Gjs05C although the docu states:

For example, an unsigned int can alias an int, but not a void* or a double. **A character type may alias any other type. **

This is a bug

7 Comments

The aliasing rules are not symmetric. Accessing any type via unsigned char is allowed, but not the opposite. So it's illegal to use a cast from buffer to even types like int if no such int objects have been created there. There are four options for pointer values in basic.compound[3], so you need to prove the result of the cast is a "pointer to an object" and not an "invalid pointer value". Even if the pointer does "represent the address" of the object, that no longer implies it "points at" the object. Your interpretation seems to imply "pointer-interconvertible" makes no difference.
With the first 2 points it is guaranteed that the cast will yield the same pointer value as the unsigned char array. The placement new places an object T there. So using that buffer afterwards casted to T* is valid, isn't it? As for the aliasing rule see my last point: It allows to access any T via unsigned char*. The placement new created an object there so this is valid again. "pointer-interconvertible" does make a difference: If the types were "pointer-interconvertible" then the address might change (which is unwanted here)
I'm now convinced "the pointer value is unchanged" refers to the possible values described in [basic.compound]/3, not to the value representation. So reinterpret_cast, via the static_cast from void*, can give a pointer of type X* which points at an object of type Y, with the two types unrelated. This is the meaning implied by the comments in the first example at en.cppreference.com/w/cpp/language/reinterpret_cast#Notes ... So my answer does need some edits.
So in the OP, ((T*)object_) points at the char object object_[0], and does not point at the object of type T created by placement new. Accessing via that pointer is an aliasing violation, since it's accessing a char via type T, although accessing any object via char is allowed. The only technically correct ways to do this, including uses of std::aligned_storage, are using std::launder, using the union technique to make the objects pointer-interconvertible, or separately storing the pointer value result of the placement-new expression (which is usually wasteful).
On reddit I found "§18.6.2.3/1-2 make it clear that placement-new always returns exactly the address passed to it". So we can be sure that placementNewResultPtr == ((T*)object_). basic.lval[11] allows us to "access the stored value of T through a glvalue of type unsigned char`. So why is this an aliasing violation? Especially if we only ever use the casted-T pointer we should also not run into lifetime issues.
|
-1

You do have undefined behavior.

object_ isn't a T*, so casting and dereferencing it is UB. You cannot use object_ to refer to the newly created object. This is also known as strict aliasing.

The fix however is easy: Just create a new member variable T* that you use to access the constructed object. Then you need to assign the result of placement new to that pointer:

ptr = new(object_) T;

[basic.life]p1 says:

The lifetime of an object o of type T ends when:

  • if T is a class type with a non-trivial destructor, the destructor call starts, or

  • the storage which the object occupies is released, or is reused by an object that is not nested within o.

So by doing new (object_) T;, you are ending the lifetime of the original char[] object and are starting the lifetime of the new T object we'll call t.

Now we have to examing whetherr *((T*)object_) is valid.

[basic.life]p8 with the important bits highlighted:

If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:

  • the storage for the new object exactly overlays the storage location which the original object occupied, and

  • the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and

  • the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and

The second point is not true (T vs char[]), so you cannot use object_ as a pointer to the newly created object t.

5 Comments

Sorry, but I still not get why casting and dereferencing would be UB if pointing to a validly constructed object. See also stackoverflow.com/a/31615329/3723423 . Please provide references to the standard to demonstrate your case.
@Rakete1111 There is no T object. this is wrong. There is a T object.The lifetime of that object begins on this line: new ((T*)object_) T;. What do you think your ptr = new(object_) T; does that the original line doesn't? That said, I would prefer your code since the cast in original code is redundant.
@user2079303 Yeah, the sentence is confusing. What I mean is that you cannot use object_ to access the new object.
@Christophe Done :) Your linked answer is wrong unfortunately.
Rakete1111 is right. You can not just take the pointer and cast it. You need std::launder to get defined behaviour again. an example is given here: en.cppreference.com/w/cpp/types/aligned_storage

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.