In Python, I would like to print a Unicode character's code rather than the glyph it represents.

For example, if u is a list of Unicode characters:

>>> u[0]
u'\u0103'
>>> print u[0]
ă

I would like to output the character code as literal text, i.e. the escape sequence \u0103 from u'\u0103'.

I have tried to just print it to a file, but this doesn't work without encoding it in UTF-8.

>>> w = open('~/foo.txt', 'w')
>>> print>>w, u[0].decode('utf-8')

Traceback (most recent call last):
  File "<pyshell#33>", line 1, in <module>
    print>>w, u[0].decode('utf-8')
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0103' in position 0: ordinal not in range(128)
>>> print>>w, u[0].encode('utf-8')
>>> w.close()

Encoding it results in the glyph ă being written to the file.

How can I write the character code?

1 Answer

To print the escaped form of a unicode string, you only need to encode it with the right codec, here raw_unicode_escape:

>>> s = u'\u0103'
>>> print s.encode('raw_unicode_escape')
\u0103
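
The same idea can be sketched for Python 3, where encode() returns bytes that have to be decoded back to text before printing or writing. This is only an illustration under a few assumptions: the helper name to_escaped and the temporary file path are made up for the example, and it uses the unicode_escape codec instead of raw_unicode_escape, because raw_unicode_escape emits characters below U+0100 as raw Latin-1 bytes rather than as escapes.

```python
import os
import tempfile


def to_escaped(text):
    """Hypothetical helper: return text with non-ASCII characters
    (and backslashes/control characters) written as escape sequences."""
    # unicode_escape yields ASCII-only bytes, so decoding as ASCII is safe.
    return text.encode('unicode_escape').decode('ascii')


s = '\u0103'
escaped = to_escaped(s)
print(escaped)  # \u0103

# Write the escaped text to a file; the path here is an assumption
# for the example (a fresh temporary directory, not ~/foo.txt).
path = os.path.join(tempfile.mkdtemp(), 'foo.txt')
with open(path, 'w', encoding='ascii') as f:
    f.write(escaped + '\n')
```

Because the escaped text is plain ASCII, the file can be written without choosing a wider encoding. Note that unicode_escape also escapes backslashes and newlines, so it may not suit text that should keep those characters verbatim.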