3

I have this issue, and I don't know if it is expected. Here is the thing:

I'm trying to load bytes from a file into a structure like this:

// member names were omitted in the post; these are placeholders
struct MY_Struct
{
    char a;
    char b;
    char c;
    char d;
    unsigned int value;
};

The problem is that when the unsigned int is filled, the bytes seem to be swapped in the read stream. E.g., if the file contains 0x45080000, the unsigned int will hold 0x84500000, which is wrong.

This can be "solved" if I change the unsigned int to a BYTE[4], but that is not what I want. Here is the code that I use to read from the file:

fopen_s( &mFile, "myFile.ext", "rb" );

if( mFile == NULL ) print( " **** E R R O R ! **** " );
else
{
    if( fread( &myStruct, sizeof( MY_Struct ), 1, mFile ) != 1 )
    {
        print( " **** E R R O R ! **** " );
        return 0;
    }
}

Is this expected behavior, or am I doing something wrong?

Regards

4
  • Endianness will bite you in the ass. Commented Jul 26, 2011 at 18:58
  • Great error messages! I'll note this for myself. Commented Jul 26, 2011 at 19:15
  • I don't know if this is an endianness problem, because I'm just reading the bytes as they are stored in the file, one by one, so endianness should not matter, I think. Commented Jul 26, 2011 at 19:36
  • lol, I just put "error" because it is an error :P, to be a little generic xD Commented Jul 26, 2011 at 19:37

4 Answers

3

As you've discovered, portable serialization can be a pain. Instead of writing and reading the whole structure, you need to serialize each attribute individually in a normalized format (network byte order is common). Then, when you deserialize, the bytes come back correctly.
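
A minimal sketch of that field-by-field approach, assuming the layout from the question (four chars followed by a 32-bit unsigned value) and a file format that stores the integer in network (big-endian) byte order. The names `Record`, `kRecordBytes`, `serialize`, and `deserialize` are hypothetical and do not come from the original post.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory record: four chars followed by a 32-bit value.
struct Record {
    char          tag[4];
    std::uint32_t value;
};

// Size of the record on disk: 4 tag bytes + 4 value bytes, no padding.
const std::size_t kRecordBytes = 8;

// Write each field separately; the integer goes out most significant byte first.
inline void serialize(const Record& r, std::uint8_t* out)
{
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<std::uint8_t>(r.tag[i]);
    out[4] = static_cast<std::uint8_t>(r.value >> 24);
    out[5] = static_cast<std::uint8_t>(r.value >> 16);
    out[6] = static_cast<std::uint8_t>(r.value >> 8);
    out[7] = static_cast<std::uint8_t>(r.value);
}

// Rebuild the record from the same byte layout, independent of the host CPU.
inline Record deserialize(const std::uint8_t* in)
{
    Record r;
    for (int i = 0; i < 4; ++i)
        r.tag[i] = static_cast<char>(in[i]);
    r.value = (static_cast<std::uint32_t>(in[4]) << 24) |
              (static_cast<std::uint32_t>(in[5]) << 16) |
              (static_cast<std::uint32_t>(in[6]) << 8)  |
               static_cast<std::uint32_t>(in[7]);
    return r;
}
```

The caller would `fread` exactly `kRecordBytes` bytes into a `std::uint8_t` buffer and pass it to `deserialize`, so struct padding and host byte order never touch the file format.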


1 Comment

Oops, then I think I will prepare for all the pain =(
1

Either serialize/deserialize each field independently (standards conforming) or...

use a platform-specific option:

#pragma pack(push,1)
struct foo {
  // ...
};
#pragma pack(pop)

This packs all members of foo to 1-byte alignment, so it won't work for a boolean.

If you intend to be cross-platform, you'll have to test the hell out of it for problems with endianness and pragma support.
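
A small sketch of what the pragma changes, assuming a header with a 4-byte name, a 1-byte kind, and a 32-bit size, as the asker describes in the comments. The struct and member names are hypothetical. `#pragma pack` is a compiler extension (MSVC, GCC, and Clang accept it), not standard C++.

```cpp
#include <cstdint>

// Hypothetical header layout: 4-byte name, 1-byte kind, 32-bit size.
struct HeaderDefault {
    char          name[4];
    std::uint8_t  kind;
    std::uint32_t size;
};

#pragma pack(push, 1)
struct HeaderPacked {
    char          name[4];
    std::uint8_t  kind;
    std::uint32_t size;
};
#pragma pack(pop)

// Packed: 4 + 1 + 4 bytes with no padding between members.
static_assert(sizeof(HeaderPacked) == 9, "packed header should be 9 bytes");

// The default layout is at least as large; compilers usually pad it so
// that size starts on a 4-byte boundary.
static_assert(sizeof(HeaderDefault) >= sizeof(HeaderPacked),
              "default layout cannot be smaller than the packed one");
```

Packing only removes padding. It does not change the byte order of `size`, so the value still has to be converted if the file and the host disagree on endianness.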

5 Comments

Yeah, it will be cross-platform, but I cannot serialize each field, because the file may be big and I only need to grab some info from it. Or do you think the best solution is to read byte by byte?
@user864094 It's hard to answer that without knowing about structure of the file. You probably need to describe what you are trying to do more.
It is a file that stores some info about frames and modules, like a sprite editor. The file has a header: the first 4 bytes indicate a name, one more byte indicates whether it is a module, animation, or frame, and then comes the unsigned int, which is the size of the info to read. The problem comes here: as I said, if the size is 0x45080000, the program itself puts the 08 at the beginning. This does not happen when I read byte by byte, only when I change the types.
Reading a binary blob into a struct is not standards compliant. There may be some endianness problems, as the others have mentioned. If byte by byte is working, then I would go with it. If the subtleties of our solutions don't make any sense, then it probably requires more research.
OK, for the moment I will go with the byte-by-byte reading. As you say, this will need more research. Thanks for your answer.
1

Just to add a bit of icing to the cake!

You must handle 32 vs. 64 bit issues also.

A very good include file is stdint.h (C99). It defines types with specific sizes, so you get fewer problems when switching the word width.
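
As a rough illustration, assuming a C++11 compiler (which provides the same types through `<cstdint>`): the checks below hold on both 32-bit and 64-bit builds, which is not guaranteed for `unsigned int` or `unsigned long`. The struct name `SizeField` is hypothetical.

```cpp
#include <climits>
#include <cstdint>

// Fixed-width types have the same size regardless of the word width.
static_assert(sizeof(std::uint32_t) * CHAR_BIT == 32,
              "uint32_t is exactly 32 bits");
static_assert(sizeof(std::uint8_t) * CHAR_BIT == 8,
              "uint8_t is exactly 8 bits");

// Hypothetical file field declared with an exact width instead of
// unsigned int or unsigned long, whose sizes vary by platform.
struct SizeField {
    std::uint32_t size;
};
```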


0

You probably need to take big/little endian into account. Try something like this:

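#include <cstdint>

typedef std::uint8_t  uint8;
typedef std::uint32_t uint32;

#define BIG_ENDIAN
//#define LITTLE_ENDIAN
...
#ifdef BIG_ENDIAN

// Store val in buf most significant byte first.
inline void writeMem32(uint8* buf, uint32 val)
{
    buf[0] = static_cast<uint8>(val >> 24);
    buf[1] = static_cast<uint8>(val >> 16);
    buf[2] = static_cast<uint8>(val >>  8);
    buf[3] = static_cast<uint8>(val >>  0);
}

// Rebuild the value from 4 bytes; cast before shifting so the
// shift happens on an unsigned 32-bit value, not on a promoted int.
inline uint32 readMem32(const uint8* buf)
{
    return (static_cast<uint32>(buf[0]) << 24) | (static_cast<uint32>(buf[1]) << 16) |
           (static_cast<uint32>(buf[2]) <<  8) |  static_cast<uint32>(buf[3]);
}

#endif // BIG_ENDIAN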

If the data is little endian, the byte order is of course reversed :)
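
For comparison, a sketch of the little-endian counterpart, assuming the file stores the least significant byte first. The names `writeMem32LE` and `readMem32LE` are hypothetical and simply mirror the functions above.

```cpp
#include <cstdint>

// Store val in buf least significant byte first (little-endian order).
inline void writeMem32LE(std::uint8_t* buf, std::uint32_t val)
{
    buf[0] = static_cast<std::uint8_t>(val);
    buf[1] = static_cast<std::uint8_t>(val >> 8);
    buf[2] = static_cast<std::uint8_t>(val >> 16);
    buf[3] = static_cast<std::uint8_t>(val >> 24);
}

// Rebuild the value from 4 little-endian bytes. The shifts work on the
// value itself, so the result is the same on any host CPU.
inline std::uint32_t readMem32LE(const std::uint8_t* buf)
{
    return  static_cast<std::uint32_t>(buf[0])        |
           (static_cast<std::uint32_t>(buf[1]) << 8)  |
           (static_cast<std::uint32_t>(buf[2]) << 16) |
           (static_cast<std::uint32_t>(buf[3]) << 24);
}
```

With the file bytes 45 08 00 00, `readMem32LE` yields 0x00000845, while a big-endian read of the same four bytes yields 0x45080000. Which one is right depends on how the file was written.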

4 Comments

Maybe this is the problem: if the file is written in little endian, are the bytes not read in the same order?
There are standard hton/ntoh functions. Don't reinvent the wheel.
@RocketR: I agree. But I found that sometimes it's useful to reinvent just to get a deeper understanding of what is really going on...that's at least what is true for myself :)
@user864094: It depends on the architecture of the system where you read the file (this has some examples). In your comment you say "first 4 bytes indicating a name...one more byte indicating if is a module or animation or frame, and then, the unsigned int that is a size". But your struct only contains 4 bytes and 1 integer, so maybe you are missing a byte here. Also: the layout of your struct is absolutely platform-dependent. You have to check how your compiler does the alignment (it might turn out that each byte gets aligned on a 32-bit boundary).
