5

I'm new in a company where the following use of a struct is done:

#include <stdio.h>
#include <string.h>

typedef unsigned char uint8;
typedef signed char int8;
typedef unsigned short int uint16;
typedef signed short int int16;

typedef struct padded_t {
    int8 array0[11];
    int8 array1[4];
    uint16 len2;
    int8 array2[25];
    uint16 len3;
    int8 array3[6];
    // many more len/arrays follow
} padded_t;

int main(int argc, char** argv)
{
    padded_t foo;
    memset((void*)&foo, 0, sizeof(padded_t));

    int8* str = "foobar";
    int16 size = (int16)strlen(str);

    int8* ptr = (int8*)&foo.len2;

    // please note that the memcpy references only the pointer to len
    // are the following couple of lines safe?
    memcpy ((void*)ptr, &size, 2);
    memcpy ((void*)&ptr[2], (void*)str, size);

    printf("%d\n", foo.len2);
    printf("%s\n", foo.array2);

    return 0;
}

I know some things about alignment and padding, and I assume the compiler (gnu C99 for an ARM9 device) will add some paddings to make the struct aligned.

But is this code safe?
As I understand it, it will be safe as long as the uint16 len variables are immediately followed by int8 array[] variables regardless of the other struct members.

Will it only add paddings before a uint16 when the size before it is odd?
Is this use correct? And more importantly, is it safe?

22
  • 2
    Get a new job! 1) Homebrew fixed width types instead of using standard types. 2) not in a header, but the implementation 3) not _static_asserted 4) Relying on implementation details/UB. 5) Wildly casting 6) Undefined behaviour. Commented Aug 10, 2016 at 17:32
  • 1
    @Olaf Expand on 6, please.. Commented Aug 10, 2016 at 17:33
  • 1
    @chqrlie But you can contribute much and get some respect and recognition. Commented Aug 10, 2016 at 17:40
  • 4
    The C standard guarantees that the members of a struct will be stored in the order they are declared, but says nothing about the alignment. The compiler is free to add any sort of padding it deems necessary between struct members to align them as needed. If this code works at all, it is very fragile and will break if ever ported to a different architecture or even a different compiler. Commented Aug 10, 2016 at 17:46
  • 1
    @Gustavo It doesn't really matter if the code works on your computer, or Bob's computer. In fact, it doesn't matter if it works on the majority of the computers, or every single computer in the universe. If the standard says the compiler is allowed to add any sort of padding or alignment, then you shouldn't take the chance. The fact that there might be the chance that a computer is created where this code does not work as expected means it is unsafe code. Commented Aug 10, 2016 at 18:15

2 Answers 2

3

You don't need to write code that works on every system. What you should do is write code that has predictable behavior. Either it works as designed or if your assumptions don't hold, a static assertion aborts the compilation.

The second memcpy line fails to do that. It assumes that offsetof(struct padded_t, len2) + 2 == offsetof(struct padded_t, array2). An assumption which will often hold, but is utterly stupid.

Why not just write

foo.len2 = strlen(str);
memcpy (foo.array2, str, foo.len2);
//possibly, foo.array2[foo.len2] = '\0';

Code is readable. No unneeded variables. No unnecessary casts. No unpredictable behavior. The original doesn't look like code you will learn a lot from, the clutter doesn't look like something I expect someone proficient in C to write.

To answer a comment of yours, making them packed is the wrong fix. Because it will misalign members and will just open another can of worms.

Also keep wary of custom fixed-size typedefs. I once had the fun of debugging a typedef char int8; on a system that had chars unsigned

Sign up to request clarification or add additional context in comments.

6 Comments

The real code is not that simple. I simplified the example to get to the point, but I just don't have no means to do what you suggested. I could show a code similar to the real one, but there is no reason for it, really. The point is already addressed by the question. Anyways, thank you for the answer.
I can only judge by the code you posted and that's just messy.
@Gustavo You don't want to show the code because "there is no reason for it," but this code does not work for you. Clearly it would be useful for us to know your real constraints. Otherwise how do you suggest we provide an answer which fits your needs? :-)
@iRove: There is no reason for it because the question is the same. I cannot post the company code here, as you are surely aware. It would take a lot of time for me to translate the code and it would not add meaning to the question. Anyways, I appreciate the effort to help me.
@iRove: The code doesn't work for me simply because it references the name of the member variable. Maybe I should have pointed in the question that this is exactly what they didn't want when they implemented the code.
|
2

But is this code safe?

As I understand it, it will be safe as long as the uint16 len variables are immediately followed by int8 array[] variables regardless of the other struct members.

It is not safe in the sense that compilers are allowed free rein to insert any amount of padding between or after structure members, so you cannot be confident that &ptr[2] points to the first byte of foo.array2. Provided that uint16 is indeed two bytes wide, however, (which is in no way guaranteed by the language) you can be confident that if size is less than 25 then the

memcpy ((void*)&ptr[2], (void*)str, size);

will modify neither any bytes of foo.len2 nor the last byte of foo.array2. Since foo was earlier filled with zeroes, this will leave foo.array2 as a properly terminated C string. It is thus safe to print it with printf(). On the other hand, C does not guarantee that the result of doing so will be the same as the result of printing str.

Will it only add paddings before a uint16 when the size before it is odd?

This is at the discretion of the compiler. It might be influenced by pragmas, command-line options, configuration options, language extensions (though none of these are in use in the example), target architecture, or anything else that the compiler wants to use to make such decisions.

Is this use correct?

As far as I can tell, the program is conforming, if that's what you mean.

And more importantly, is it safe?

The program's output is not predictable from its code alone, so in that sense, no, it is not safe.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.