I am using Python to convert some files to a binary format, but I've run into an odd snare.

Problem

Code

import struct
s = struct.Struct('Bffffff')
print(s.size)

Result

28

Obviously the expected size would be 25, but it appears to be interpreting the first byte (B) as a 4-byte integer of some kind. It will also write out a 4-byte integer instead of a byte.

Work-around

A work-around exists, namely separating the B out into a separate struct, like so:

Code

import struct
s1 = struct.Struct('B')
s2 = struct.Struct('ffffff')
print(s1.size + s2.size)

Result

25

Is there any explanation for this behavior?

  • The reason the struct is bigger than expected is that padding bytes are added to it so that the floats in the struct are properly aligned. See en.wikipedia.org/wiki/Data_structure_alignment Commented Feb 8, 2015 at 12:03
  • An alternate work-around is to place the byte field at the end of the struct: print(struct.Struct('ffffffB').size) prints 25. Commented Feb 8, 2015 at 12:07
  • The first letter is padded to 4 bytes. Commented Feb 8, 2015 at 12:08
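The padding rule these comments describe can be sketched by hand. This is only an illustration: `SIZES` and `padded_size` are hypothetical names, and the sketch assumes each field is aligned to its own size (1 for B, 4 for f), which is typical but not guaranteed on every platform.

```python
# Hypothetical helper that mimics native-style alignment by hand.
# Assumption: every field is aligned to a multiple of its own size.
SIZES = {'B': 1, 'f': 4}

def padded_size(codes):
    """Total size of the fields in `codes`, padding between members only."""
    offset = 0
    for code in codes:
        size = SIZES[code]
        # Round the offset up to the next multiple of the field's alignment.
        offset = (offset + size - 1) // size * size
        offset += size
    return offset

print(padded_size('Bffffff'))  # 28: 1 byte + 3 padding bytes + 6 * 4
print(padded_size('ffffffB'))  # 25: the byte sits at offset 24, nothing follows it
```

Because padding is only inserted in front of a member that needs it, moving the byte to the end avoids it entirely.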

2 Answers

5

From the struct module docs:

Padding is only automatically added between successive structure members. No padding is added at the beginning or the end of the encoded struct.
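A small sketch of what "no padding at the end" means in practice. The struct docs also describe ending a format with a zero-count type code to pad the end to that type's alignment; the variable names below are made up for illustration, and the exact sizes depend on the platform (25 and 28 would be typical where a float is 4 bytes with 4-byte alignment).

```python
import struct

# Native mode adds no padding after the last member.
no_tail = struct.calcsize('@ffffffB')

# A trailing zero-count 'f' asks struct to pad the end to float alignment.
with_tail = struct.calcsize('@ffffffB0f')

print(no_tail, with_tail)
```

The second form is mainly useful when the data has to match an array of C structs, where the compiler pads each element to its alignment.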

If you test the sizes individually:

>>> import struct
>>> s1 = struct.Struct('B')
>>> print(s1.size)
1
>>> s1 = struct.Struct('f')
>>> print(s1.size)
4

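So the sizes of the separate structs in the question add up to 1 + 6 × 4 = 25. In a single native-mode struct, however, B occupies 1 byte and each f must start on a 4-byte boundary, so 3 padding bytes are inserted after B and the total is 28. Consider this example: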

>>> s1 = struct.Struct('Bf')
>>> print(s1.size)
8

Here B takes 1 byte, followed by 3 padding bytes, and f takes 4 bytes, for a total of 8, as expected.
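To see where the padding actually sits, one can pack a value and look at the bytes. This is a rough sketch, assuming a platform where a native float is 4 bytes; the padding bytes come out as zeros in CPython as far as I can tell, but the docs do not spell that out for alignment padding.

```python
import struct

# Pack one unsigned byte and one float with native alignment.
packed = struct.pack('@Bf', 1, 1.0)

print(len(packed))   # total size, including any padding
print(packed[:1])    # the B field
print(packed[1:-4])  # whatever lies between B and f (the padding)
print(packed[-4:])   # the f field, in native byte order
```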

As the docs mention, to override this you have to use non-native size and alignment:

>>> s1 = struct.Struct('!Bf')
>>> print(s1.size)
5

No padding is added when using non-native size and alignment, e.g. with ‘<’, ‘>’, ‘=’, and ‘!’.

3

Unless you specify a byte order/alignment character, struct uses native byte order, size, and alignment (@), which causes the padding.

By explicitly specifying byte order, you can get what you want:

>>> struct.Struct('!Bffffff').size  # network byte order
25
>>> struct.Struct('=Bffffff').size  # native byte order, no alignment.
25
>>> struct.Struct('>Bffffff').size  # big endian
25
>>> struct.Struct('<Bffffff').size  # little endian
25
>>> struct.Struct('@Bffffff').size  # native byte order, alignment. (+ native size)
28
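For the file-conversion use case in the question, a minimal round-trip sketch with an explicit byte order might look like the following. The record contents and the choice of little-endian are assumptions made for illustration; use whatever order the target format requires.

```python
import struct

# Hypothetical record: one flag byte followed by six floats,
# little-endian, standard sizes, no padding.
record = struct.Struct('<Bffffff')

data = record.pack(7, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
print(len(data))            # 25
print(record.unpack(data))  # (7, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
```

The resulting bytes can be written to a file opened in binary mode and will have the same layout on any machine.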

2 Comments

'@' also means native size, which may give you a bad surprise when you copy your Python code to a different computer (32 vs 64 bit).
@Thinkeye, Thank you for the information. I updated the code comment to include that information.
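The native-size caveat can be checked directly by comparing '@' with '=' for a few integer codes. This is only a sketch: the native column depends on the machine and compiler, while the standard sizes are fixed by the struct module (4 for i and l, 8 for q).

```python
import struct

# Compare native sizes ('@') with standard sizes ('=') for some integer codes.
for code in 'ilq':
    native = struct.calcsize('@' + code)
    standard = struct.calcsize('=' + code)
    print(code, native, standard)
```

On many 64-bit Unix systems the native size of l is 8 rather than 4, which is exactly the kind of difference the comment warns about.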
