The code below encodes a string to UTF-8:

#!/usr/bin/python
# -*- coding: utf-8 -*-

str = 'ورود'
print(str.encode('utf-8'))

That prints:

b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'

But I can't decode this string with this code:

#!/usr/bin/python
# -*- coding: utf-8 -*-

str = '\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'
print(str.decode('utf-8'))

The error is:

Traceback (most recent call last):
  File "C:\test.py", line 5, in <module>
    print(str.decode('utf-8'))
AttributeError: 'str' object has no attribute 'decode'

Please help me.

Edit

Following the answers, I switched to a byte string:

#!/usr/bin/python
# -*- coding: utf-8 -*-

str = b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'
print(str.decode('utf-8'))

Now the error is:

Traceback (most recent call last):
  File "C:\test.py", line 5, in <module>
    print(str.decode('utf-8'))
  File "C:\Python34\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>
  • str = b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'; print(str.decode('utf-8')) Commented Oct 31, 2014 at 8:18
  • I have tested it in my interpreter, it worked. Commented Oct 31, 2014 at 8:21
  • It works for me. Check whether your Python is installed. Commented Oct 31, 2014 at 8:22
  • Which version of Python? Python 3 and Python 2 treat Unicode characters differently. Commented Oct 31, 2014 at 8:28
  • You should specify which version of Python you use, as Python 2 and Python 3 behave very differently in string handling. Commented Oct 31, 2014 at 8:29

3 Answers

5

It looks like you're using Python 3.X. You .encode() Unicode strings (u'xxx' or 'xxx'). You .decode() byte strings b'xxxx'.

#!/usr/bin/python
# -*- coding: utf-8 -*-

s = b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'
#   ^
#   Need a 'b'
#
print(s.decode('utf-8'))
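
To make the rule concrete, here is a minimal sketch for Python 3. The names text and data are mine, not from the question, and the script prints only ASCII so it should run on any console:

```python
# Python 3: str holds Unicode text, bytes holds encoded data.
text = '\u0648\u0631\u0648\u062f'      # the same four characters as in the question
data = text.encode('utf-8')            # str -> bytes

# str has .encode() but no .decode(); bytes has .decode() but no .encode().
print(type(text).__name__, hasattr(text, 'decode'))   # str False
print(type(data).__name__, hasattr(data, 'encode'))   # bytes False

print(data)                            # b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'
print(data.decode('utf-8') == text)    # True
```

Calling .decode() on a str is what raises the AttributeError shown in the question.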

Note that your terminal may not be able to display the Unicode string. My Windows console can't:

Python 3.3.5 (v3.3.5:62cf4e77f785, Mar  9 2014, 10:35:05) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> s = b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'
>>> #   ^
... #   Need a 'b'
... #
... print(s.decode('utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "D:\dev\Python33x64\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>

But the decode itself succeeds. '\uxxxx' represents a Unicode code point.

>>> s.decode('utf-8')
'\u0648\u0631\u0648\u062f'
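
If you want to check the result from a script rather than the interactive prompt, one option is the built-in ascii(), which escapes every non-ASCII character and so does not depend on the console's code page. A small sketch (variable names are illustrative):

```python
data = b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'
text = data.decode('utf-8')

# ascii() returns a repr with non-ASCII characters escaped.
print(ascii(text))                           # '\u0648\u0631\u0648\u062f'

# Eight UTF-8 bytes decode to four code points (two bytes each here).
print(len(data), len(text))                  # 8 4
print(['U+%04X' % ord(ch) for ch in text])   # ['U+0648', 'U+0631', 'U+0648', 'U+062F']
```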

My PythonWin IDE supports UTF-8 and can display the characters:

>>> s = b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'
>>> print(s.decode('utf-8'))
ورود
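
If you are stuck with a console that cannot show the characters, a possible workaround is to replace whatever the console encoding lacks with escapes before printing. This is only a sketch: console_safe is a hypothetical helper name, and it assumes sys.stdout.encoding reports the console's code page (for example cp437):

```python
import sys

def console_safe(text, encoding=None):
    """Return text with characters the target encoding lacks replaced by \\uXXXX escapes."""
    # Fall back to ASCII if the stream does not report an encoding.
    encoding = encoding or sys.stdout.encoding or 'ascii'
    return text.encode(encoding, errors='backslashreplace').decode(encoding)

s = b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'
# On a cp437 console this prints \u0648\u0631\u0648\u062f instead of raising.
print(console_safe(s.decode('utf-8')))
```

On a console that does support the characters, the text passes through unchanged.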

You can also write the data to a file and display it in an editor that supports UTF-8, like Notepad. Since your original string is already UTF-8, just write it to a file directly as bytes. 'wb' opens the file in binary mode, so the bytes are written as is:

>>> with open('out.txt','wb') as f:
...     f.write(s)

If you have a Unicode string, you can write it as UTF-8 with:

>>> with open('out.txt','w',encoding='utf8') as f:
...     f.write(u)  # assuming "u" is already a decoded Unicode string.
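
To convince yourself that both approaches produce the same file, here is a rough round-trip sketch. It writes to a temporary directory (my choice, to avoid clobbering a real out.txt) and reads the file back both as raw bytes and as text:

```python
import os
import tempfile

data = b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'
text = data.decode('utf-8')

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'out.txt')

    # Text mode with an explicit encoding: Python encodes for us.
    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)

    # Binary read shows the bytes that actually landed on disk.
    with open(path, 'rb') as f:
        raw = f.read()

    # Text read with the same encoding recovers the original string.
    with open(path, encoding='utf-8') as f:
        round_tripped = f.read()

print(raw == data)              # True
print(round_tripped == text)    # True
```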

P.S. str is a built-in type. Don't use it for variable names.

Python 2.x works differently. 'xxxx' is a byte string and u'xxxx' is a Unicode string, but you still .encode() the Unicode string and .decode() the byte string.


7 Comments

What does "not work" mean? Show an error message. Most likely, your terminal cannot display the characters you are trying to print. Are you getting a UnicodeEncodeError? That is the terminal complaining that it doesn't support the characters you are printing.
That's the error: File "D:\Python34\lib\encodings\cp1252.py", line 19, in encode return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>
Whatever IDE you are using is mapping to code page 1252 (US English Windows). It can't display those foreign characters. Use an IDE that supports UTF-8. See my updated example.
Yes, your code can decode but can't print. How can I save this decoded string to a file with Python?
It can print if you use a UTF-8 IDE, or as you suggest you can write it to a file and open it in Notepad, which supports UTF-8.
1

Python 2 has a first-class unicode type that you can use in place of the plain bytestring str type. It’s easy, once you accept the need to explicitly convert between a bytestring and a Unicode string:

>>> persian_enter = unicode('\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf', 'utf8')
>>> print persian_enter
ورود

Python 2 had two global functions to coerce objects into strings: unicode() to coerce them into Unicode strings, and str() to coerce them into non-Unicode strings. Python 3 has only one string type, Unicode strings, so the str() function is all you need. (The unicode() function no longer exists.)
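
In Python 3, the closest counterpart to the unicode(bytes, 'utf8') call above is str() with an explicit encoding argument, which gives the same result as bytes.decode(). A minimal sketch (variable names are illustrative):

```python
data = b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'

# str(bytes, encoding) decodes, just like unicode(bytes, encoding) did in Python 2.
via_constructor = str(data, 'utf-8')
via_method = data.decode('utf-8')

print(via_constructor == via_method)   # True
print(ascii(via_constructor))          # '\u0648\u0631\u0648\u062f'
```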

Read more about reading and writing Unicode data.

3 Comments

What's your Python version?
Python 2 had two global functions to coerce objects into strings: unicode() to coerce them into Unicode strings, and str() to coerce them into non-Unicode strings. Python 3 has only one string type, Unicode strings, so the str() function is all you need. (The unicode() function no longer exists.)
Yep. So, alongside the good answer by @Mark, I suggest reading more about Unicode in Python 3.x in the link I added to my answer.
0

Use the following code:

str = b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'
print(str.decode('utf-8'))

2 Comments

His original code works (on Mac it does); I think he doesn't have Python installed properly.
No, this code does not work. I ran it on Windows 7 with Python 3.4.
