UnicodeDecodeError with Hex Value

Question

When I run

u''.startswith('x\x9c')

I end up with an exception

UnicodeDecodeError: 'ascii' codec can't decode byte 0x9c in position 1: ordinal not in range(128)

Why does 'x\x9c' get decoded as ASCII rather than as Unicode, given that I am calling the method on the unicode string u''?

Tanu · Accepted Answer · 2016-05-10 17:08:00Z

This happens because Python can't decode 'x\x9c' with the ASCII codec: \x9c is a non-ASCII byte. Try this:

import unidecode
u''.startswith(unidecode.unidecode_expect_nonascii('x\x9c'))

Output: this returns False, because the unidecode library function now represents the byte string 'x\x9c' in ASCII.

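This also happens because you mixed a unicode string and a byte string. In Python 2, when the two are combined, the byte string is implicitly decoded with the default codec, which is ASCII, and \x9c is outside its range. So if you need to check a.startswith(b), then a and b should both be unicode strings or both be byte strings; otherwise you can get a UnicodeDecodeError.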
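As a minimal sketch of the point above, the following makes the implicit ASCII decode explicit and then avoids it by decoding the bytes by hand. It is written to run on both Python 2 and Python 3. The names raw, text, error_message and result are only illustrative, and Latin-1 is an assumption: use whatever encoding the bytes were actually written in.

```python
from __future__ import print_function

raw = b'x\x9c'  # a byte string; 0x9c is outside the ASCII range (0-127)

# Python 2 performs this decode implicitly when a byte string is
# passed to a method of a unicode string. Doing it by hand shows
# where the exception comes from.
try:
    raw.decode('ascii')
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)
print(error_message)

# Decoding explicitly keeps both operands unicode, so no implicit
# ASCII decode is needed. Latin-1 is assumed here for illustration.
text = raw.decode('latin-1')
result = u''.startswith(text)
print(result)  # False
```

Going the other way also works: encode the unicode string and compare bytes with bytes, so that both sides of startswith have the same type.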

Hope this helps!

edited May 10, 2016 at 17:08

answered May 10, 2016 at 17:01

Tanu

1 Comment

Andy Over a year ago

But why is it trying to convert x\x9c to ASCII as opposed to Unicode?
