7

I am writing a module that is supposed to work in both Python 2 and 3 and I need to define a binary string.

Usually this would be something like data = b'abc', but this code fails on Python 2.5 with an invalid syntax error.

How can I write the above code in a way that will work in all versions of Python from 2.5 onwards?

Note: this has to be binary (it can contain any byte value, including 0xFF); this is very important.

4
  • Binary string? Do you mean a bytes object? Commented Oct 13, 2011 at 13:50
  • 8
    The b"abc" syntax and the bytes() constructor were added in Python 2.6. Commented Oct 13, 2011 at 13:53
  • Yes, I was referring to bytes. Commented Oct 13, 2011 at 14:13
  • When you search for Python 2 and Python 3 compatibility, both the six library and my book, which offer essentially the same working solutions for this, appear on the first page of the results. Yet nobody seems to know either of them exists. How can we fix that? Spread the word! Commented Oct 13, 2011 at 20:29

3 Answers

6

I would recommend the following:

from six import b

That requires the six module, of course. If you don't want that, here's another version:

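import sys

if sys.version_info[0] < 3:
    # Python 2: a plain str is already a byte string
    def b(x):
        return x
else:
    import codecs
    # Python 3: map code points 0-255 one-to-one onto bytes
    def b(x):
        return codecs.latin_1_encode(x)[0]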

More info.

These solutions (essentially the same) work, are clean, as fast as you are going to get, and can support all 256 byte values (which none of the other solutions here can).
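As a rough illustration of the claim about all 256 byte values, here is a minimal sketch that uses the hand-written b() helper (with the version check done through sys.version_info) and feeds it every value from 0 to 255. The names data and all_bytes are made up for this example. On Python 3 the literal passed to b() may only contain code points below 256; anything higher would raise UnicodeEncodeError.

```python
import sys

if sys.version_info[0] < 3:
    def b(x):
        # Python 2: str is already a byte string
        return x
else:
    import codecs

    def b(x):
        # Python 3: Latin-1 maps code points 0-255 one-to-one onto bytes
        return codecs.latin_1_encode(x)[0]

# A small binary block with non-ASCII values, written without a b'' literal
data = b('\x00abc\x80\xff')

# Every possible byte value, built from a plain string
# (all_bytes is a name invented for this sketch)
all_bytes = b(''.join(chr(i) for i in range(256)))
```

The result is a str on Python 2 and a bytes object on Python 3, which is the same pair of types that b'...' produces on versions that support the literal.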


2

If the string only has ASCII characters, call encode. This will give you a str in Python 2 (just like b'abc'), and a bytes in Python 3:

'abc'.encode('ascii')

If not, rather than putting binary data in the source, create a data file, open it with 'rb' and read from it.
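A minimal sketch of the data-file approach, assuming a file called blob.bin (a hypothetical name). The file is written here only so that the example is self-contained; in practice it would ship alongside the module. struct.pack is used to build the demo bytes because it avoids the b'' literal, and try/finally is used instead of a with statement because Python 2.5 needs a __future__ import for with.

```python
import os
import struct
import tempfile

# Demo setup only: build three raw bytes without a b'' literal
payload = struct.pack('3B', 0x00, 0x80, 0xFF)
path = os.path.join(tempfile.mkdtemp(), 'blob.bin')  # hypothetical file name

f = open(path, 'wb')
try:
    f.write(payload)
finally:
    f.close()

# The part the answer describes: open with 'rb' and read the bytes back
f = open(path, 'rb')
try:
    data = f.read()
finally:
    f.close()
```

Reading in 'rb' mode returns a str on Python 2 and a bytes object on Python 3, with no decoding applied in either case.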

7 Comments

As you suspected I do have several very small binary blocks, so using files for storing them is not an option. And yes they have non-ascii values.
So, what do the strings actually look like? If they're human-readable strings, decode them with the proper encoding. If not, then use base64.
Create a file and read from it? Complicated solution for a simple problem. Sorry, -1.
(And using ascii is limiting without reason, use latin1 instead).
@LennartRegebro: That wouldn't work in Python 2; try '\xff'.encode('latin1').
-3

You could store the data base64-encoded.

First step would be to transform into base64:

>>> import base64
>>> base64.b64encode(b"\x80\xFF")
b'gP8='

Do this once; whether you need the b prefix depends on the version of Python you run it with.

In the second step, put the resulting base64 string into your program without the b prefix, so that the same source is valid in both Python 2 and Python 3.

import base64
x = 'gP8='
base64.b64decode(x.encode("latin1"))

This gives you a str '\x80\xff' in 2.6 (it should work in 2.5 as well) and a bytes object b'\x80\xff' in 3.x.

As an alternative to the two steps above, you can do the same with hex data:

import binascii
x = '80FF'
# Encoding first keeps this working on Python 3.0-3.2, where unhexlify rejects str
binascii.unhexlify(x.encode("ascii")) # `bytes()` in 3.x, `str()` in 2.x
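A quick way to check that the base64 and hex variants are interchangeable is to decode both and compare the results. This is only a sketch; the variable names are invented for the example, and both input strings are encoded to ASCII first so the calls behave the same on Python 2 and on every Python 3 release.

```python
import base64
import binascii

# Both literals are plain strings, so the source needs no b'' prefix
from_b64 = base64.b64decode('gP8='.encode('ascii'))
from_hex = binascii.unhexlify('80FF'.encode('ascii'))

# The two encodings describe the same two bytes, 0x80 and 0xFF
same = (from_b64 == from_hex)
length = len(from_hex)
```

Hex doubles the size of the data in the source while base64 adds roughly a third, so for very small blocks the more readable hex form costs little.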

8 Comments

Oops, the code is going to be quite cryptic. Can't we find a solution that will work with hex?
Have you tried the code in Python 3? binascii.unhexlify(x) gives TypeError: 'str' does not support the buffer interface
I don't understand what the base64 part is supposed to do. You can remove it and it will still work.
@sorin: strange... here it works fine in Python 3.1 (r31:73572, Jul 5 2010, 13:15:03). Maybe x.encode("latin1") works better here as well...
@Lennart Regebro It is supposed to be an alternative, as hex was preferred. b'\x80\xff' gets encoded to 'gP8=' in base64 and to '80FF' in hex.