Unicode list to String list Python 2

Question

I have this list:

l = [u'\xf9', u'!']

And I want to convert it into this list:

l2 = ['ù','!']

How can I do it? And why does l.encode() not work?

What do you mean by convert? 'ù' is just one representation of your character. Do you mean that you want to print it like that?
– Kasravnd, Commented Apr 13, 2015 at 20:39
[u.encode('u8') for u in l]. l[0].encode() won't work because the character is outside the ASCII range (code points below 128).
– Shashank, Commented Apr 13, 2015 at 20:42
That's what I did, Shashank, but why is 'ù' converted to '\xc3\xb9'? This should have been my question... sorry.
– user3477823, Commented Apr 13, 2015 at 20:44

Sylvain Leroux · Accepted Answer · 2015-04-14 08:02:20Z


Are you using Python 2? If so, you might be fooled by the way Python displays strings.

As you noticed, '\xc3\xb9' is the UTF-8 encoded representation of code point U+00F9 ('ù'). So:

# code point
>>> u'ù'
u'\xf9'

# seems wrong?
>>> u'ù'.encode('utf-8')
'\xc3\xb9'

# No, not at all (at least on my UTF-8 terminal)
>>> print(u'ù'.encode('utf-8'))
ù
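
To make the relationship between the code point and its encoded form concrete, here is a minimal sketch. The variable names are illustrative and not from the question, and the code is written so that it should run unchanged on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
# One code point, U+00F9 (LATIN SMALL LETTER U WITH GRAVE)
code_point = u'\xf9'

# Encoding produces a byte string: `str` in Python 2, `bytes` in Python 3
encoded = code_point.encode('utf-8')

# The individual byte values of the encoded form: 0xC3 and 0xB9
byte_values = list(bytearray(encoded))

# Decoding the two bytes gives back the original single code point
round_trip = encoded.decode('utf-8')
```

So '\xc3\xb9' is not a corrupted 'ù'. It is how the interpreter spells out the two bytes that a UTF-8 terminal renders as one character.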

Given your example:

>>> l = [u'\xf9', u'!']
>>> print(l)
[u'\xf9', u'!']
>>> l[0]
u'\xf9'
>>> print(l[0])
ù

>>> l2 = [u.encode('utf-8') for u in l]
>>> l2
['\xc3\xb9', '!']
>>> print(l2)
['\xc3\xb9', '!']
>>> print(l2[0])
ù
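
As for why l.encode() fails, the following sketch separates the two distinct errors involved. It assumes the same list as in the question; the names list_error and ascii_error are only there for illustration.

```python
# -*- coding: utf-8 -*-
l = [u'\xf9', u'!']

# 1. A list has no encode() method; only the strings inside it do.
try:
    l.encode('utf-8')
except AttributeError as exc:
    list_error = type(exc).__name__

# 2. U+00F9 cannot be represented in ASCII. Python 2 falls back to
#    ASCII when encode() is called without an argument, so the codec
#    is named explicitly here to get the same failure on Python 3.
try:
    l[0].encode('ascii')
except UnicodeEncodeError as exc:
    ascii_error = type(exc).__name__

# 3. Encoding element by element with an explicit codec works.
l2 = [u.encode('utf-8') for u in l]
```

The list comprehension is therefore the right tool. What looks odd afterwards is only how the resulting list is displayed.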

I agree all of this is rather inconsistent and a source of frustration. That's why string/Unicode support was substantially reworked in Python 3. There:

# Python 3
>>> l = [u'\xf9', u'!']
>>> l
['ù', '!']
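
For comparison, here is a small Python 3 only sketch of the same list. It is not part of the original session. It just illustrates that text and bytes are separate types there, and that only the bytes are shown with escapes.

```python
# Python 3 only: str is always Unicode text, bytes is a separate type
l = ['\xf9', '!']

# The display form of a list of str shows printable characters as-is
shown = repr(l)                      # "['ù', '!']"

# Encoding yields bytes objects, marked with a b prefix when displayed
encoded = [s.encode('utf-8') for s in l]
shown_bytes = repr(encoded)          # "[b'\\xc3\\xb9', b'!']"
```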

edited Apr 14, 2015 at 8:02

answered Apr 13, 2015 at 21:10

Sylvain Leroux

