
From the C11 standard

I just read this in the C11 standard and I'm not able to understand why, when a value is stored in an object of structure or union type, the padding bytes of the object representation take unspecified values.

I'm aware that "unspecified" means the standard imposes no requirements.


When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.

The value of a structure or union object is never a trap representation, even though the value of a member of the structure or union object may be a trap representation.

Can someone give me an example to understand this statement better?


3 Answers


Basically it goes like this:

  • To get correct alignment for members, structs and unions may need padding bytes. The compiler inserts these automatically, wherever it pleases.
  • The C committee did not put any requirement on what value a padding byte must have. This was done on purpose, so that a program need not write to the padding bytes whenever a struct is initialized or copied. If the standard had required the padding bytes to have, for example, the value zero, then this would have introduced an ever-so-slight execution overhead.

    (This ever-so-slight potential performance gain is why structs have all these obscure mechanisms attached to them. This is truly the spirit of C - if we can make something ever so slightly faster, at the price of obfuscation and inconsistency, then make it so.)

  • However, C also allows exotic systems to have trap representations - certain bit sequences that will yield a run-time exception of some kind, whenever they are read. If padding bytes were allowed to have any value, they might have ended up as trap representations. Therefore, there's an exception saying that a padding byte may have any value, but not the value of a trap representation.

Therefore: you cannot trust the padding bytes to have any given value (except they can't be trap representations). The value of the padding bytes may differ from case to case - there is no guarantee that their values are consistent.

Consider a plain 32 bit two's complement system with no trap representations:

#include <stdint.h>  /* for uint8_t, uint32_t */

typedef struct
{
  uint8_t  u8;
  uint32_t u32;
} something_t;

something_t thing1 = {1, 2};
something_t thing2 = {3, 4};

Here, one possible memory layout would be this (hex, little endian):

01 AA BB CC 02 00 00 00  // thing 1
03 55 66 77 04 00 00 00  // thing 2
^  ^        ^
u8 padding  u32

where in thing1, the 01 is the u8 member, the AA BB CC sequence is padding bytes with unspecified values and 02 00 00 00 is the u32 member.

If we now write thing1 = thing2, the compiler is allowed to do one of the following:

  • copy the whole thing2 and overwrite all of thing1, including padding, if this is the most efficient,
  • or if more efficient, copy u8 and u32 but don't write to the padding bytes, leaving them as they were, resulting in thing1 now having the memory layout 03 AA BB CC 04 00 00 00.

And this is actually the reason why we can't reliably compare structs with functions like memcmp() - the padding bytes may differ even when all members are equal. (The == operator isn't defined for structure types at all.)
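The safe alternative is to compare member by member, so the unspecified padding bytes are never examined. A minimal sketch, reusing the `something_t` type from above (the helper name `something_equal` is just for illustration):

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct
{
  uint8_t  u8;
  uint32_t u32;
} something_t;

/* Compare member by member - padding bytes are never read,
   so their unspecified values cannot affect the result. */
static bool something_equal(const something_t *a, const something_t *b)
{
  return a->u8 == b->u8 && a->u32 == b->u32;
}
```

Unlike memcmp(), this gives the same answer regardless of what the compiler left in the padding.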



Assume sizeof(long)=8, sizeof(int)=4, 64 bit machine:

struct {
    int a;
    long b;
} obj = {0, 0};

It makes sense to have 4 bytes of padding between &obj.a and &obj.b. What should be the content of the padding? Why force the runtime to put anything there? The layout of obj may as well be 0x00000000, 0xDEADBEEF, 0x00000000, 0x00000000 - i.e. 4 bytes of garbage, since they need not be accessed (and therefore shouldn't).

In general, the question goes the other way - if you think it should be specified (i.e. you want to force compiler writers to do more work, and potentially less efficient work), it is you who should explain why.
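You can observe the padding described above with sizeof and offsetof. A small sketch, assuming an LP64-style ABI (4-byte int, 8-byte long) - the actual amount of padding is implementation-defined:

```c
#include <stddef.h>  /* offsetof */

/* Assumed LP64-style layout for illustration; the standard does not
   mandate these sizes or this padding. */
struct obj {
    int  a;   /* offset 0 on the assumed system              */
    long b;   /* aligned to 8, so padding precedes it        */
};

/* Number of padding bytes between a and b on this implementation. */
static size_t padding_between_a_and_b(void)
{
    return offsetof(struct obj, b) - (offsetof(struct obj, a) + sizeof(int));
}
```

On a typical 64-bit machine this reports 4 bytes of padding; the standard only guarantees that it is some non-negative amount.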



Well, the thing is that you never know for sure beforehand how the members of a structure or union are laid out in memory. The padding may differ from system to system.

You can also consider the compiler's approach to this: the layout it chooses depends on its optimization decisions, so one compiler may lay the members out one way and another compiler another way.

A trap representation is a bit pattern that fits into the space occupied by a type, but triggers undefined behavior if used as a value of that type.

For example:

struct a {
   int  a;
   char b;
};

Here the size of the structure is not necessarily sizeof(int) + sizeof(char): padding is added, and the standard says nothing about its contents. If you work with the structure as a whole, the representation works. But if you rely on the object having some exact byte-for-byte layout - say a strict 8-byte representation - it may fail.
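If you do need a predictable byte-for-byte representation (say, for hashing or memcmp), a common workaround is to memset the whole object to zero before filling in the members, so the padding holds a known value. A sketch (the helper `init_a` is just for illustration; note that, strictly speaking, C11 6.2.6.1 allows a later member store to make padding take unspecified values again, though in practice compilers leave it alone):

```c
#include <string.h>

struct a {
    int  a;
    char b;
};

/* Zero the whole object first so the padding after b holds zeros;
   only then is a byte-wise memcmp of two such objects meaningful
   in practice (the standard does not strictly guarantee it). */
static void init_a(struct a *p, int x, char y)
{
    memset(p, 0, sizeof *p);
    p->a = x;
    p->b = y;
}
```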

