Convert Unicode char code to char on Python

Question

I have a list of Unicode character codes that I need to convert into characters in Python 2.7.

U+0021
U+0022
U+0023
.......
U+0024

How can I do that?

Mark Tolonen · Accepted Answer · 2018-05-11 04:03:06Z

2

This regular expression will replace all U+nnnn sequences with the corresponding Unicode character:

import re

s = u'''\
U+0021
U+0022
U+0023
.......
U+0024
'''

s = re.sub(ur'U\+([0-9A-F]{4})', lambda m: unichr(int(m.group(1), 16)), s)

print(s)

Output:

!
"
#
.......
$

Explanation:

unichr gives the character of a codepoint, e.g. unichr(0x21) == u'!'.
int('0021',16) converts a hexadecimal string to an integer.
lambda m: expression is an anonymous function that receives the regex match.
It is equivalent to def func(m): return expression, but defined inline.
re.sub matches a pattern and sends each match to a function that returns the replacement. In this case, the pattern is U+hhhh where h is a hexadecimal digit, and the replacement function converts the hexadecimal digit string into a Unicode character.
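
For readers on Python 3, the same pieces fit together almost unchanged. The following is a minimal sketch, assuming Python 3, where unichr has been merged into chr and the ur'' prefix is no longer valid syntax. The helper name replace_codepoints and the widened {4,6} quantifier (to admit code points beyond U+FFFF) are illustrative choices, not part of the answer above.

```python
import re

# Matches "U+" followed by 4 to 6 uppercase hexadecimal digits.
# Assumption: codes are written in uppercase, as in the question.
CODEPOINT = re.compile(r'U\+([0-9A-F]{4,6})')

def replace_codepoints(text):
    """Replace every U+hhhh sequence in text with the character it names."""
    # The lambda receives each match object; group(1) holds the hex digits.
    return CODEPOINT.sub(lambda m: chr(int(m.group(1), 16)), text)

print(replace_codepoints('U+0021 U+0022 U+0023 U+0024'))
```

Text that does not match the pattern, such as the ....... line in the sample input, passes through untouched. A value above 10FFFF would make chr raise ValueError, so untrusted input may need an extra range check.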

edited May 11, 2018 at 4:03

answered May 11, 2018 at 3:54

Mark Tolonen


1 Comment

Arno Over a year ago

Yeah, I thought there should be something simpler. Thanks!

aurel1510 · Accepted Answer · 2022-11-29 16:26:32Z

1

In case anyone using Python 3 or later wonders how to do this effectively, I'll leave this post here for reference, since I didn't realize the author was asking about Python 2.7.

Just use the built-in Python function chr():

char = chr(0x2474)
print(char)

Output:

⑴

Remember that the digits after U+ in a Unicode code point such as U+WXYZ form a hexadecimal number, which in Python is written as 0xWXYZ.
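
When the codes arrive as text such as U+2474 rather than as integer literals, the hexadecimal part has to be parsed first. Here is a minimal sketch, assuming Python 3 and input strings that always begin with U+. The function name char_from_code is hypothetical.

```python
def char_from_code(code):
    """Convert a string like 'U+0021' to the character it denotes."""
    if not code.upper().startswith('U+'):
        raise ValueError('expected a code of the form U+hhhh: %r' % (code,))
    # Drop the 'U+' prefix and read the remainder as a base-16 integer.
    return chr(int(code[2:], 16))

codes = ['U+0021', 'U+0022', 'U+0023', 'U+0024']
chars = [char_from_code(c) for c in codes]
print(''.join(chars))
```

Applied to the codes from the question, this yields the characters !, ", # and $.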

edited Nov 29, 2022 at 16:26

answered Nov 29, 2022 at 16:20

aurel1510


Gautam Kumar · Accepted Answer · 2018-05-11 04:13:27Z

0

The code below encodes each Unicode string in the list as ASCII, silently dropping any character that ASCII cannot represent.

for item in codes:  # codes is the list of Unicode strings
    print(item.encode('ascii', 'ignore'))
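
To see what encode('ascii', 'ignore') does in practice, the following sketch (assuming Python 3, with made-up sample strings) shows that non-ASCII characters are dropped, while a U+nnnn code written out in ASCII passes through unchanged rather than being turned into its character. In Python 3 the result is a bytes object.

```python
# Hypothetical inputs: one string with a non-ASCII character,
# one string holding a code point written in U+nnnn notation.
samples = ['caf\u00e9', 'U+0021']

# 'ignore' discards any character that cannot be encoded as ASCII.
encoded = [s.encode('ascii', 'ignore') for s in samples]

for original, result in zip(samples, encoded):
    print(repr(original), '->', repr(result))
```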

answered May 11, 2018 at 4:13

Gautam Kumar


Sherpa · Accepted Answer · 2018-05-11 01:23:43Z

-2

a = 'U+aaa'
a.encode('ascii','ignore')
'U+aaa'

This converts from Unicode to ASCII, which I think is what you want.

answered May 11, 2018 at 1:23

Sherpa


1 Comment

hcheung Over a year ago

This returns a str in Python 2 but returns bytes in Python 3.
