Struct member alignment -- different sizeof using 16-bit and 32-bit compiler

Question

I have a structure used to contruct messages to a control board I need to maintain software compatibility between a C167 16-bit Keil compiler and a 32-bit Tricore gcc compiler.

typedef struct
{
    unsigned char new_weld_status[2];
    UINT32 new_weld_count;
    UINT16 new_weld_fail_count;
} NEW_PULSE_DATA;

The array new_weld_status[2] takes up 2 bytes on the 16-bit compiler but 4 bytes on the 32-bit compiler. I was thinking of replacing all the new_weld_status[2] with a union when compiling with gcc. But is there a switch I can use for gcc that makes the chars fits/aligns in 2 bytes?

Thanks

@Someprogrammerdude If I had to guess (which I guess I have to since we don't have minimal reproducible example ), I'd say OP is taking the sizeof a pointer, perhaps decayed from an array. — Christian Gibbons
– Christian Gibbons, Commented May 16, 2018 at 22:03
Check out The Lost Art of C Structure Packing for details on why this happens and how to avoid it. — dbush
– dbush, Commented May 17, 2018 at 2:58
No the array takes up 2 bytes on both systems. But on the 32 bit system you get padding inserted after the array. The solution is to not write code that depends on a certain memory layout of the struct. It is not obvious to me why this code would not be portable, unless you do stupid things with it like memcpy(&the_struct, something, 8), in which case the solution is to correct the code that does stupid things. — Lundin
– Lundin, Commented May 17, 2018 at 8:28
@Acorn Using structs to achieve a certain memory layout is non-portable. — Lundin
– Lundin, Commented May 17, 2018 at 12:34

ChuckCottrill · Accepted Answer · 2018-05-16 23:27:43Z

6

Note that your structure layout creates the problem on a 32-bit system. Many (most) 32-bit CPU architectures require 4-byte alignment for 32-bit words, thus the new_weld_count requires 'padding' to provide proper memory alignment.

typedef struct
{
    unsigned char   new_weld_status[2];   //a
    //char padding_1[2]; //hidden padding
    UINT32  new_weld_count;               //a
    UINT16  new_weld_fail_count;          //a
} NEW_PULSE_DATA;

The following redefinition of your structure completely avoids the problem.

typedef struct
{
    UINT32  new_weld_count;               //a
    UINT16  new_weld_fail_count;          //a
    unsigned char   new_weld_status[2];   //a
} NEW_PULSE_DATA;
NEW_PULSE_DATA ex_PULSE_DATA;

However, the above approach is not the approach typically to transport struct(ured) data across networks/over message transports. A more common and much better approach is to use a serialization/deserialization layer (aka marshalling) to place the structures into 'over the wire' formats. Your current approach is conflating the in-memory storage and addressing with the communication format.

//you need to decide on the size of wire format data,
//Both ends of the protocol must agree on these sizes,
#define new_weld_count_SZ sizeof(ex_PULSE_DATA.new_weld_count)
#define new_weld_fail_count_SZ sizeof(ex_PULSE_DATA.new_weld_fail_count)
#define new_weld_status_SZ sizeof(ex_PULSE_DATA.new_weld_status)

//Then you define a network/message format
typedef struct
{
    byte new_weld_count[new_weld_count_SZ];
    byte new_weld_fail_count[new_weld_count_SZ];
    byte new_weld_status[new_weld_count_SZ];
} MESSAGE_FORMAT_PULSE_DATA;

Then you would implement serialization & deserialization functions on both ends of the transport. The following example is simplistic, but conveys the gist of what you need.

byte*
PULSE_DATA_serialize( MESSAGE_FORMAT_PULSE_DATA* msg, NEW_PULSE_DATA* data )
{
    memcpy(&(msg->new_weld_count), data->new_weld_count, new_weld_count_SZ);
    memcpy(&(msg->new_weld_fail_count), data->new_weld_fail_count, new_weld_fail_count_SZ);
    memcpy(&(msg->new_weld_status), data->new_weld_status, new_weld_status_SZ);
    return msg;
}

NEW_PULSE_DATA*
PULSE_DATA_deserialize( NEW_PULSE_DATA* data, MESSAGE_FORMAT_PULSE_DATA* msg )
{
    memcpy(data->new_weld_count, &(msg->new_weld_count), new_weld_count_SZ);
    memcpy(data->new_weld_fail_count, &(msg->new_weld_fail_count), new_weld_fail_count_SZ);
    memcpy(data->new_weld_status, &(msg->new_weld_status), new_weld_status_SZ);
    return msg;
}

Note that I have omitted the obligatory network byte order conversions, because I assume your have already worked out your byte order issues between the two cpu domains. If you have not considered byte-order (big-endian vs. little-endian), then you need to address that issue as well.

When you send a message, the sender does the following,

//you need this declared & assigned somewhere
NEW_PULSE_DATA data;
//You need space for your message
MESSAGE_FORMAT_PULSE_DATA msg;
result = send(PULSE_DATA_deserialize( &data, &msg ));

When you receive a message, the recipient does the following,

//recipient needs this declared somewhere
NEW_PULSE_DATA data;
//Need buffer to store received data
MESSAGE_FORMAT_PULSE_DATA msg;
result = receive(&msg,sizeof(msg));
//appropriate receipt checking here...
PULSE_DATA_deserialize( &data, &msg );

edited May 16, 2018 at 23:27

answered May 16, 2018 at 22:57

ChuckCottrill

4,4792 gold badges30 silver badges45 bronze badges

Sign up to request clarification or add additional context in comments.

8 Comments

PinGNU Over a year ago

I can not change either method the receiving WinCE machine expects structure in order so all the 16bit machines would need new code but I am going to rewrite all the code and switch to Win embedded arm so will keep this post thanks

Acorn Over a year ago

While these are good points to know about, the question is not about reordering the struct or marshalling.

Jubatian Over a year ago

@PinGNU I guess that reveiving machine is a third one in the scenario (compared to the C167 and the TriCore mentioned in the question). You don't need to do anything with that: the binary format used on the communications protocol is there, settled. You only need to get the C167 & the TriCore to adhere to it by appropriate code which generates the right binary stream on both. To solve this problem, serializing & deserializing is a proper solution (on the C167 / TriCore end).

Lundin Over a year ago

@Acorn No the question is about "the array takes up 4 bytes" which is first of all incorrect, and second the padding should not matter. So the actual problem is the OP doing something icky with the struct, such as sending the whole struct as a data protocol, which comments indicate. And which is always non-portable bad practice. The correct solution is to serialize the struct.

too honest for this site Over a year ago

@Acorn: There is no necessity serialisation uses a single instruction more than such bad code. In the worst case, provide a set of serialisation functions hand-optimised. But that would be the very last step if there truely is a performance/space issue for a particular platform. Until then, leave optimisations to the compiler and write clean, maintainable code.

|

KamilCuk · Accepted Answer · 2018-05-16 22:32:30Z

Union wouldn't change the members alignment inside a struct. You are interested in padding. The compiler may insert any number of bytes/bits between struct members to satisfy alignment requiremens. On gcc compatible compilers you may use __attribute__((__packed__)) as Acorn already pointed out, but this does not take care of endianess. The most compatible version between platforms (including platforms with different alignment and different endianess) would be to use (sadly!) get/set functions that look like this:

typedef struct {
    unsigned char data[2+4+2];
} NEW_PULSE_DATA;

unsigned char NEW_PULSE_DATA_get_new_weld_status(NEW_PULSE_DATA *t, size_t idx) {
    return t->data[idx];
}

void NEW_PULSE_DATA_set_new_weld_status(NEW_PULSE_DATA *t, size_t idx, unsigned char value) {
    t[idx] = value;
}

UINT32 NEW_PULSE_DATA_get_new_weld_count(NEW_PULSE_DATA *t) {
    return (UINT32)t->data[2]<<24
        | (UINT32)t->data[3]<<16 
        | (UINT32)t->data[4]<<8 
        | (UINT32)t->data[5];
}

void NEW_PULSE_DATA_set_new_weld_count(NEW_PULSE_DATA *t, UINT32 val) {
    t->data[2] = val>>24;
    t->data[3] = val>>16;
    t->data[4] = val>>8;
    t->data[5] = val;
}

UINT16 NEW_PULSE_DATA_get_new_weld_fail_count(NEW_PULSE_DATA *t) {
    return (UINT16)t->data[6]<<8
        | (UINT16)t->data[7];
}

void NEW_PULSE_DATA_set_new_weld_fail_count(NEW_PULSE_DATA *t, UINT16 val) {
    t->data[6] = val>>8;
    t->data[7] = val;
}

This is the only "good" way of being 100% sure, that NEW_PULSE_DATA looks exactly the same on different platforms (at least on platforms with the same number of bits per char/CHAR_BIT value). However sizeof(NEW_PULSE_DATA) may be still different between platforms, because compiler may insert padding on the end of the struct (after the last member of the structure). So you may want to change NEW_PULSE_DATA type to be just an array of bytes:

typedef unsigned char NEW_PULSE_DATA[2+4+2];

unsigned char NEW_PULSE_DATA_get_new_weld_status(NEW_PULSE_DATA t, size_t idx) {
    return t[idx];
}

unsigned char NEW_PULSE_DATA_set_new_weld_status(NEW_PULSE_DATA t, size_t idx, unsigned char value) {
    t[idx] = value;
}

UINT32 NEW_PULSE_DATA_get_new_weld_count(NEW_PULSE_DATA t) {
    return (UINT32)t[2]<<24
        | (UINT32)t[3]<<16 
        | (UINT32)t[4]<<8 
        | (UINT32)t[5];
}

void NEW_PULSE_DATA_set_new_weld_count(NEW_PULSE_DATA t, UINT32 val) {
    t[2] = val>>24;
    t[3] = val>>16;
    t[4] = val>>8;
    t[5] = val;
}

UINT16 NEW_PULSE_DATA_get_new_weld_fail_count(NEW_PULSE_DATA t) {
    return (UINT16)t[6]<<8
        | (UINT16)t[7];
}

void NEW_PULSE_DATA_set_new_weld_fail_count(NEW_PULSE_DATA t, UINT16 val) 
{
    t[6] = val>>8;
    t[7] = val;
}

Just that: never use home-brew fixed width types. The standard provides stdint.h for good reasons - use it.

Acorn · Accepted Answer · 2018-05-16 22:13:44Z

1

For gcc and other compilers, you can use __attribute__((packed)):

This attribute, attached to struct or union type definition, specifies that each member (other than zero-width bit-fields) of the structure or union is placed to minimize the memory required.

answered May 16, 2018 at 22:13

Acorn

26.4k7 gold badges50 silver badges78 bronze badges

15 Comments

Andrew Henle Over a year ago

__attribute__((packed)) is a good way to wind up with code that doesn't run on machines that actually have alignment restrictions on data. Not every machine out there is x86-based, and given the machines here are likely 16- and 32-bit microcontrollers, I'd venture that __attribute__(packed) is more likely than not to cause problems.

Lundin Over a year ago

This does not answer the question. Why are you on about gcc? The OP is not using gcc but Keil.

Lundin Over a year ago

Yeah and how exactly is __attribute__ from gcc portable to Keil?

too honest for this site Over a year ago

Packing should work on all platforms supporteing it. Nevertheless it can result in catastrophic performance, because not aligned members have to be accessed by multiple instructions which assemble/split the parts for each access to the data. The correct way is to use proper marshalling at a single point and otherwise use the natural alignment (i.e. avoid packing). Packing also does not solve endianess and encoding problems.

Acorn Over a year ago

@Olaf: Indeed, it may or may not result in bad performance; and the loss of performance may or may not matter for a particular system. We simply don't know enough to decide. Anyway, the question was about how to force the compiler to pack a struct, nothing else. Endianness and encoding have nothing to do with this, either.

|

Collectives™ on Stack Overflow

Struct member alignment -- different sizeof using 16-bit and 32-bit compiler

3 Answers 3

8 Comments

1 Comment

15 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

8 Comments

1 Comment

15 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related