4

EDIT: The wrong type of num2 has been corrected.

Hello,

I have some character arrays of known size which contain raw integer data read from a binary file.

Each of these arrays has exactly the size of an integer type.

I would like to ask whether the following operation is safe and accurate in ALL normal situations, assuming that the endianness of the raw data and that of the computer running this code agree.

char arr1[4] = { ... };
char arr2[2] = { ... };

uint32_t num1 = *reinterpret_cast<uint32_t*>(arr1); /* OR num1 = *(uint32_t*)arr1 in C */
uint16_t num2 = *reinterpret_cast<uint16_t*>(arr2); /* OR num2 = *(uint16_t*)arr2 in C */

Thank you!

4
  • 3
    Since it's a binary file, you'd be better off reading the raw integers into an array of ints. Commented Dec 7, 2010 at 19:19
  • 1
    Don't ever declare more than one variable at a time. It can lead to subtle bugs. securecoding.cert.org/confluence/display/seccode/… Commented Dec 7, 2010 at 19:23
  • Directly casting character memory to an integer only works on CPUs with byte alignment, not word alignment. If you're only/always running on an Intel x86, that won't be a problem. Commented Dec 7, 2010 at 19:36
  • @chrisaycock Well... actually I was... (BTW I don't see any problem with declaring more than one variable at a time... I think it is just a personal preference.) Commented Dec 7, 2010 at 19:40

6 Answers

5

You should use a union.

union charint32 {
    char arr1[4];
    uint32_t num;
};

This will simplify storage and casting for you.

2 Comments

As long as the byte order is the same, this will work, and won't have datatype alignment problems.
+1. This will work. It's not compliant with the C++ standard, but then again, this is probably one of the most-violated rules.
3

It is technically safe, but there are a few things I would consider:

  • Add compile-time asserts to verify the sizes. Are you SURE that your char array equals sizeof(your_int_type)? Your num2 is a great example of why this is important - your typo would cause undefined behavior.
  • Consider the alignment. Are you sure that your char array is on a 4-byte boundary (assuming your int is 4 bytes)? PowerPC for example will crash if you try to read an int from an unaligned pointer.
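
Both checks in the list above can be sketched as follows. This is only a minimal sketch, assuming C++11 or later: `is_aligned_for` is a hypothetical helper name, not something from the question, and the array contents are placeholder bytes.

```cpp
#include <cstdint>

// Placeholder data; alignas requests storage aligned suitably for uint32_t.
alignas(std::uint32_t) char arr1[4] = {0x01, 0x02, 0x03, 0x04};

// Compile-time check: the buffer must be exactly as large as the integer type.
static_assert(sizeof(arr1) == sizeof(std::uint32_t),
              "arr1 must have the size of uint32_t");

// Hypothetical helper: true if p is aligned suitably for an object of type T.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

A call such as `is_aligned_for<std::uint32_t>(arr1)` could be checked before the cast. A char array declared without `alignas` carries no such alignment guarantee.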

8 Comments

The type of the corresponding integer is determined in my program with some template metaprogramming code, so the correct type is guaranteed.
Is there any way to verify that the char array is REALLY on a 4-byte boundary?
Get its address and check its alignment.
@Cameron: Thanks! Btw, you'll also want to use aligned pointers on Intel CPUs - it won't crash, but it will be A LOT slower since the CPU will internally need to fetch two words and paste them together.
Will !(arr % 4) do? Or have I oversimplified the problem?
1

This should be safe:

char arr1[4] = { ... };

uint32_t num1;

memcpy(&num1, arr1, sizeof num1);

But why is arr2 only 2 bytes big? Is that a typo?
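
The memcpy approach shown in this answer could be wrapped in a small helper that works for any integer width. This is a minimal sketch, assuming C++11 or later; `load_native` is a hypothetical name, and the bytes are assumed to already be in the host's byte order, as the question states.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copies sizeof(T) bytes from a raw buffer into a T.
// memcpy has no alignment requirement on the source, so the buffer may
// start at any address.
template <typename T>
T load_native(const char* bytes) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    T value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

With the arrays from the question, this would be called as `load_native<std::uint32_t>(arr1)` and `load_native<std::uint16_t>(arr2)`. The caller remains responsible for making sure the buffer holds at least `sizeof(T)` bytes.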

1 Comment

@Dibling: As long as you're in control of where the code is deployed, you do.
0

A safer approach would be to use a macro (e.g. MAKEDWORD) to put the bytes in their proper order.
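
One way to put the bytes in order explicitly is sketched below, written as inline functions rather than a macro. The names `read_u32_le` and `read_u16_le` are hypothetical, and the sketch assumes the file stores its integers in little-endian order; a big-endian file would need the shift amounts reversed.

```cpp
#include <cstdint>

// Assemble a 32-bit value from 4 bytes stored least-significant byte first.
// Going through unsigned char avoids sign extension when char is signed.
inline std::uint32_t read_u32_le(const char* p) {
    const unsigned char* b = reinterpret_cast<const unsigned char*>(p);
    return  static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}

// Same idea for a 16-bit value stored least-significant byte first.
inline std::uint16_t read_u16_le(const char* p) {
    const unsigned char* b = reinterpret_cast<const unsigned char*>(p);
    return static_cast<std::uint16_t>(b[0] | (b[1] << 8));
}
```

Because the value is built arithmetically, the result does not depend on the host's endianness or on the alignment of the buffer.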

0

If you are sure the arrays are properly aligned, then there shouldn't be a problem (given the endianness).

In the code, however, I don't know what you're doing with arr2, since it is 16 bits, and you are reading a 32 bit quantity from it.

0

Yes, that should work fine (under your assumption of endianness), since the representation of these bytes in memory is the same regardless of whether it's interpreted as an array of bytes or an integer.

Really all you're doing is changing the type, not the data.
