2

offsetof is defined like this in stddef.h:

#define offsetof(type, member) ((size_t)&((type *)0)->member)

Does this invoke undefined behavior due to the dereference of a NULL pointer? If not, why?

17
  • 3
    there is no dereference of a NULL pointer Commented Aug 3, 2019 at 21:15
  • 5
    It has & in front of it. That's important. Commented Aug 3, 2019 at 21:18
  • 2
    The contents of <stddef.h> are part of a C implementation, not part of a program. Asking whether it is undefined is like asking whether some assembly language that happens to be part of the source of a compiler has behavior not defined by the C standard: of course it does, because it is not covered by the standard. Normally <stddef.h> is designed in conjunction with a compiler. Unless you are trying to implement your own <stddef.h> for a compiler that you do not control, and can rely only on what the standard specifies, the question is misplaced. Commented Aug 3, 2019 at 21:28
  • 3
    Related: c-faq.com/struct/offsetof.html Commented Aug 3, 2019 at 21:35
  • 1
    @user3386109 That's asking why it works. I'm asking if it's undefined behavior (and if not, why). Commented Aug 3, 2019 at 21:46

2 Answers

6

In normal C code, the behavior of ((size_t)&((type *)0)->member) is not specified by the C standard:

  • First, per C 2018 6.5.2.3 paragraph 4, about ->, ((type *)0)->member designates the lvalue of the member named member of the structure to which (type *)0 points. But (type *)0 does not point to a structure, so there is no member for this to be the lvalue of.
  • Supposing it does give an lvalue for some hypothetical structure, there is no guarantee that taking its address and converting it to size_t yields the offset of the member. One reason is that we do not know that (type *)0 yields an address that is actually represented with zero in the implementation's addressing scheme. Another is that C 2018 6.3.2.3 paragraph 6 says only that the result of converting a pointer to an integer is implementation-defined, not that it yields the address in any otherwise meaningful form.
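For contrast, here is a minimal sketch of getting a member offset without forming an lvalue through a null pointer. It assumes a hypothetical struct sample that is not part of the question, and the helper function names are illustrative only. One function uses the offsetof that the implementation's <stddef.h> provides; the other subtracts two character pointers into a real object, which is defined because both point into the same object.

```c
#include <stddef.h>  /* offsetof, size_t */

/* Hypothetical example type, used only for illustration. */
struct sample {
    char   tag;
    int    value;
    double weight;
};

/* Offset of `value` as reported by the implementation's offsetof. */
size_t value_offset_from_macro(void)
{
    return offsetof(struct sample, value);
}

/* Offset of `value` measured on a real object. Both pointers point into
   the same object `s`, so the pointer subtraction is well defined. */
size_t value_offset_from_object(void)
{
    struct sample s = {0};
    return (size_t)((char *)&s.value - (char *)&s);
}
```

Both functions should agree on a conforming implementation, and neither depends on how a null pointer is represented or on what a pointer-to-integer conversion produces.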

When this code is in a standard header, such as <stddef.h>, it is under the control of the C implementation and not the C standard, so questions about whether it is undefined according to the C standard do not apply. The C standard only says how the standard headers behave when included. An implementation may use any means it chooses to achieve the required effects, whether that is defining the behavior of source code that is not fully defined by the C standard or putting source code in an entirely different language in the headers. (In fact, the file stddef.h could be entirely empty or not exist at all, and the compiler could supply its required declarations when it sees #include <stddef.h> without reading any actual file from disk.)


10 Comments

When in <stddef.h>, "it is under the control of the C implementation", so no UB: yes.
Nitpick: No pointer-to-integer conversion occurs in this code fragment. (type *)0 produces the NULL pointer with type type. Applying -> to a null pointer is a dereference and it has undefined behavior. The special cases in 6.5.3.2 for &*ptr and &ptr[i] do not apply, so the presence of & is irrelevant.
@zwol Interestingly port70.net/~nsz/c/c11/n1570.html#6.6p9 mentions that -> with & may be used in the creation of an address constant provided no object is accessed. That would imply -> in this context is perhaps not really a dereference (=object access?).
@PSkocik No, if the committee had meant -> not to be a dereference when it's underneath & (in the expression tree), they would have said so in 6.5.3.2, as they did for &*ptr and &ptr[i]. (There is a colorable argument that this is an oversight and someone should file a DR, though.)
@zwol Given that member is of type int, and located at offset x in the structure, Is not &ptr->member equivalent to &*(int *)((char *)ptr + x)?
-3

Leaving aside all the other reasons why it might not be a correct implementation of offsetof,

#define offsetof(type, member) ((size_t)&((type *)0)->member)

is not appropriate even as part of the implementation because everything in stddef.h must work correctly in both C and C++, and in C++, the above construct definitely will misbehave in the presence of overloaded operator->. This is why GCC's stddef.h switched to using a special intrinsic function called __builtin_offsetof upwards of fifteen years ago.

Yes, I am saying that if you saw this in some stddef.h, that stddef.h is buggy.
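As a rough sketch of what relying on the intrinsic looks like from user code: the wrapper name MY_OFFSETOF, the struct packet type, and the helper function below are all hypothetical and exist only for this illustration. __builtin_offsetof is the GCC intrinsic mentioned above (Clang accepts it as well); on other compilers the sketch simply defers to the offsetof that their own <stddef.h> provides.

```c
#include <stddef.h>  /* offsetof, size_t */

/* MY_OFFSETOF is a hypothetical wrapper name used only in this sketch. */
#if defined(__GNUC__) || defined(__clang__)
#  define MY_OFFSETOF(type, member) __builtin_offsetof(type, member)
#else
   /* Assumption: defer to whatever the implementation's header provides. */
#  define MY_OFFSETOF(type, member) offsetof(type, member)
#endif

/* Hypothetical example type. */
struct packet {
    unsigned char  kind;
    unsigned short length;
    unsigned long  payload;
};

/* The result is an integer constant expression, so it can size an enum. */
enum { PACKET_LENGTH_OFFSET = MY_OFFSETOF(struct packet, length) };

size_t packet_length_offset(void)
{
    return MY_OFFSETOF(struct packet, length);
}
```

Ordinary code has no reason to wrap the intrinsic like this, since including <stddef.h> and calling offsetof already gets whatever the implementation chose; the sketch only shows the shape of the guarded definition.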

8 Comments

Well, to be specific, I saw this (I quoted the wrong version):

#if __GNUC__ > 3
#define offsetof(type, member) __builtin_offsetof(type, member)
#else
#define offsetof(type, member) ((size_t)&((type *)0)->member)
#endif
Still buggy. That construct is a little different but it has all the problems of the one you originally quoted. This simply can't be done in C++ without a dedicated compiler intrinsic. Whatever implementation this is should have an #error in its fallback case.
There is no compulsion that the C++ compiler be able to use a header provided by a C compiler; the only requirement is that the <stddef.h> provided by the C++ compiler must be acceptable to the C++ compiler (and the <stddef.h> provided by the C compiler must be acceptable to the C compiler, of course).
Since when did a language-lawyer question defer to reality or C++, @JL2210? Especially a 'C language only' language-lawyer question.
@JL2210: Even were it true that, in practice, a paired C implementation and C++ implementation used a common file named stddef.h to implement <stddef.h>, surely there are C implementations that are not paired with C++ implementations, and hence the assertion in this answer that code in stddef.h must work correctly in both C and C++ is false.