Basically it goes like this:
- To get correct alignment for its members, a struct or union may need padding bytes. The compiler inserts these automatically, as it pleases.
- The C committee did not put any requirement on what value a padding byte must have. This was done on purpose, so that a program need not write to the padding bytes whenever a struct is initialized or copied. If the standard had required the padding bytes to have, for example, the value zero, this would have introduced an ever-so-slight execution overhead.
(This ever-so-slight potential performance gain is why structs have all these obscure mechanisms attached to them. This is truly the spirit of C - if we can make something ever so slightly faster, at the price of obfuscation and inconsistency, then make it so.)
- However, C also allows exotic systems to have trap representations - certain bit sequences that will yield a run-time exception of some kind whenever they are read. If padding bytes were allowed to have any value with no further rule, a struct might have ended up as a trap representation because of its padding. Therefore, the standard adds an exception (C11 6.2.6.1): padding bytes take unspecified values, but the value of a struct or union object is never a trap representation.
Therefore: you cannot trust the padding bytes to have any given value (except they can't be trap representations). The value of the padding bytes may differ from case to case - there is no guarantee that their values are consistent.
Consider a plain 32-bit two's complement system with no trap representations:
#include <stdint.h>

typedef struct
{
  uint8_t  u8;
  uint32_t u32;
} something_t;
something_t thing1 = {1, 2};
something_t thing2 = {3, 4};
Here, one possible memory layout would be this (hex, little endian):
01 AA BB CC 02 00 00 00 // thing1
03 55 66 77 04 00 00 00 // thing2
^  ^        ^
u8 padding  u32
where in thing1, the 01 is the u8 member, the AA BB CC sequence is padding bytes with unspecified values and 02 00 00 00 is the u32 member.
If we now write thing1 = thing2, the compiler is allowed to do one of the following:
- copy the whole thing2 and overwrite all of thing1, including padding, if this is the most efficient,
- or if more efficient, copy u8 and u32 but don't write to the padding bytes, leaving them as they were, resulting in thing1 now having the memory layout 03 AA BB CC 04 00 00 00.
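And this is actually the reason why C does not provide the == operator for structs, and why functions like memcmp() cannot be trusted to compare them: two structs whose members are all equal may still differ in their padding bytes.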