
I'd like to "equate" two arrays, where one is inside a fixed union (which should not be changed). Instead of using memcpy, I'd simply point the head of myUnion.RawBytes to the head of array. But the compiler throws an error for the myUnion.RawBytes = &array[0]; assignment. Why is this so? Is there any way I can circumvent this problem?

The faulty code below tries to illustrate this.

#include <stdio.h>

typedef union{
    unsigned char  RawBytes[2];
    unsigned short RawWord;
} MyUnion;

int main(){
    MyUnion myUnion;

    char array[2] = {1, 1};
    myUnion.RawBytes = &array[0];

    printf("%d", myUnion.RawWord);

    return 0;
}

Error:

main.c: In function ‘main’:
main.c:12:22: error: assignment to expression with array type
     myUnion.RawBytes = &array[0];
  • You cannot copy array contents using the assignment operator; use memcpy, a for loop, or a pointer. Commented Sep 7, 2018 at 11:39
  • Yes, but I don't need a "true" copy. I simply want the two arrays to point to the same location, not to 2 separate blocks of memory. Commented Sep 7, 2018 at 11:41
  • Arrays do not "point"; they are allocated by the linker like variables. Pointers do point. Change to a pointer and point it to the first element of the array. Many things you can do with arrays will be possible then. Commented Sep 7, 2018 at 11:43
  • I thought about that, but it changes the size of my union (from 2 to 8), and so RawWord returns a false value. Commented Sep 7, 2018 at 11:52

3 Answers

1

The correct way of union punning.

#include <stdio.h>

typedef union{
    unsigned char  RawBytes[2];
    unsigned short RawWord;
} MyUnion;

int main(){
    MyUnion myUnion;

    char array[2] = {1, 1};
    myUnion.RawBytes[0] = array[0];
    myUnion.RawBytes[1] = array[1];

    printf("%d", myUnion.RawWord);

    return 0;
}

4 Comments

This is the very same thing as memcpy though.
@Lundin there is no other way of union punning without breaking the aliasing rules
That's not entirely clear to me. As proven in comments to another answer, myUnion = (MyUnion*)array; ... *myUnion is just fine and doesn't break strict aliasing. The question is if the same could be said about myUnion = (MyUnion*)array; ... myUnion->RawWord, where the lvalue access is of type unsigned short. Probably not. Then what about myUnion = (MyUnion*)array; ... MyUnion foo = *myUnion; ... foo.RawWord . I believe that would be well-defined.
@P__J__: The Standard imposes no requirements on what happens if code uses an lvalue of type unsigned short to access an object of type MyUnion. The authors of the Standard likely thought it sufficiently obvious how any quality implementation should behave that there was no need to add condescending language to that effect. On the other hand, the fact that it should be obvious how implementations should behave in some case doesn't mean some compiler writers won't have other ideas. That can happen even when the rationale says how common implementations are expected to behave.
1

I read the question as: can one take any array of 2 characters and interpret its value as an unsigned short without copying, by using this clever union trick? The answer is no, you can't.

The reason is not strict aliasing, but that it can break alignment requirements. Almost all platforms have an alignment requirement of at least 2 for unsigned short. The behaviour is undefined if a pointer is converted to another pointer type and the result is not correctly aligned for the referenced type.

Yes, this can crash on x86. Forget about being able to access unaligned objects with machine language - you're programming in C, not in assembly.


The correct way to do this is to use memcpy, which tells the compiler that the access can be unaligned, i.e.

#include <stdint.h>  /* uint16_t */
#include <string.h>  /* memcpy */

char array[2] = {1, 1};
uint16_t raw_word;
memcpy(&raw_word, array, sizeof raw_word);

Do note that memcpy is a standard library function, and the compiler is allowed to generate any kind of machine code as long as it behaves as if the memcpy function from the standard library had been called.

6 Comments

I am doing this in Embedded C, with a well-known architecture and without an underlying OS
@davidanderle and you're using a C compiler. There is no language called "Embedded C". Better read the compiler manual thoroughly. Many have been wrong before.
Yes, that is true. I'll dig deeper
Well, in embedded there's a whole lot of 8 and 16 bitters that have no alignment requirements at all, so the code will work just fine there, as far as alignment is concerned.
@Lundin it still comes from the C compiler, not the platform.
-1

To achieve what you want, you can use the approach below.

Note: The approach below does not follow the strict aliasing rule.

#include <stdio.h>

typedef union{
    unsigned char  RawBytes[2];
    unsigned short RawWord;
} MyUnion;

int main(){
    MyUnion *myUnion;

    unsigned char array[2] = {1, 1};
    myUnion = &array;

    printf("%d", myUnion->RawWord);
    printf("\n%d %d", myUnion->RawBytes[0], myUnion->RawBytes[1]);

    return 0;
}

I strongly recommend keeping the array inside the union and using memcpy or a for loop.

9 Comments

You must use a cast myUnion = (MyUnion*)&array;. And your remark about strict aliasing is not correct: MyUnion is a union type that includes an unsigned char[2] among its members. However, those who aren't sure about how strict aliasing works should definitely not use this method.
This approach will break pointer alignment rules. MyUnion and unsigned char possibly have different alignment requirements. (In particular, the unsigned short RawWord member possibly has different alignment requirements.) The pointers are simply not compatible.
@Lundin actually I am not sure that the standard says that this is OK. When it talks about type punning, it talks about storing into a union object, but here an array[2] of unsigned char is type-punned into a union first.
@Lundin perhaps you should write a QA about this :D Indeed it seems that 6.5p7 allows this.
@AnttiHaapala Seems pretty clear? unsigned char array[2] is an object that shall have its stored value accessed only by an lvalue expression such as myUnion->... that has one of the following types: ... an aggregate or union type that includes one of the aforementioned types among its members. Italics being taken from 6.5 in the standard. Or are you saying that the lvalue should only be accessed by for example *myUnion rather than myUnion->...?
