Even though I tried to specify the encoding in Python's gzip.open(), it always seems to use cp1252.py to decode the file's content. My code:

with gzip.open('file.gz', 'rt', 'cp1250') as f:
    content = f.read()

Response:

File "C:\Python34\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 52893: character maps to <undefined>

1 Answer

Python 3.x

gzip.open is defined as:

gzip.open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)

Therefore, gzip.open('file.gz', 'rt', 'cp1250') sends it these arguments:

- filename = 'file.gz'
- mode = 'rt'
- compresslevel = 'cp1250'

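This is clearly wrong, because the intention is to use 'cp1250' as the encoding. With encoding left as None, text mode falls back to the platform's default encoding, which on this Windows installation is evidently cp1252, hence the traceback from cp1252.py. The encoding argument can be passed either as the fourth positional argument or as a keyword argument: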

gzip.open('file.gz', 'rt', 5, 'cp1250')  # 4th positional argument

gzip.open('file.gz', 'rt', encoding='cp1250') # keyword argument
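
To make the difference concrete, here is a minimal sketch (not part of the original answer) that first uses inspect.signature to show where the question's third positional argument ends up, then round-trips some text through a gzip file with the encoding passed by keyword. The temporary file name and the sample string are assumptions made only for illustration.

```python
import gzip
import inspect
import os
import tempfile

# Bind the question's call to gzip.open's parameters without actually calling it.
bound = inspect.signature(gzip.open).bind('file.gz', 'rt', 'cp1250')
misbound = dict(bound.arguments)
# 'cp1250' is bound to compresslevel; encoding is never set.
print(misbound)

# Illustrative sample text; every character exists in cp1250.
sample = '\u0179r\u00f3d\u0142o'

# Hypothetical temporary file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'sample.gz')

# Write and read back in text mode, naming the encoding explicitly.
with gzip.open(path, 'wt', encoding='cp1250') as f:
    f.write(sample)

with gzip.open(path, 'rt', encoding='cp1250') as f:
    content = f.read()

print(content == sample)
```

Passing encoding by keyword is probably the safer habit, since it cannot be confused with compresslevel.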

Python 2.x

The Python 2 version of gzip.open does not take an encoding argument and does not accept text modes, so the decoding has to be done explicitly after reading the data:

with gzip.open('file.gz', 'rb') as f:
    data = f.read()

decoded_data = data.decode('cp1250')
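
The same read-then-decode approach can be wrapped in a small function. The sketch below is an illustration under assumptions: read_gzip_text is a hypothetical helper name, and the temporary file holds just the single byte 0x8f from the traceback in the question. That byte is Ź (U+0179) in cp1250 but is undefined in Python's cp1252 codec, which is why the original read failed.

```python
import gzip
import os
import tempfile


def read_gzip_text(path, encoding):
    """Hypothetical helper: read the gzip file as bytes, then decode explicitly."""
    with gzip.open(path, 'rb') as f:
        data = f.read()
    return data.decode(encoding)


# Hypothetical temporary file containing the byte reported in the traceback.
path = os.path.join(tempfile.mkdtemp(), 'sample.gz')
with gzip.open(path, 'wb') as f:
    f.write(b'\x8f')

# Decoding with the intended encoding succeeds.
text = read_gzip_text(path, 'cp1250')

# Decoding the same byte as cp1252 raises, as in the question.
try:
    read_gzip_text(path, 'cp1252')
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True
```

Reading in binary mode and decoding afterwards also works on Python 3, which can be useful when the same code has to run under both versions.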

3 Comments

If you looked more closely at my interpreter's response, you would see that I'm using Python 3.4, whereas your answer refers to Python 2.7.
According to gzip.open() specification in the Python3 docs, it takes these arguments: gzip.open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None), and setting the mode to rt means you're getting the text, not the binary.
Why else would it use cp1252.py for anything when encoding bytes?
