0

I have the following code in Python:

# myFile.csv tends to look like:
# 'a1',   'ふじさん',   'c1'
# 'a2',   'ふじさん',   'c2'
# 'a3',   'ふじさん',   'c3'

s = u"unicodeText" # unicodeText like, ふじさん بعدة أش  일본富士山Ölkələr
with codecs.open('myFile.csv', 'w+', 'utf-8') as f: # codecs open
    f.write(s.encode('utf-8', 'ignore'))
  1. I used Vim to edit the code and to open "myFile.csv";
  2. The Unicode text displays correctly in the terminal;
  3. but it does not display correctly in Excel or in a browser;
  4. My platform is OS X.

I don't know whether this is a configuration problem or whether my code is wrong. If you have any idea, please advise. Deeply appreciated!


Edit: changed open to codecs.open.
Thanks for pointing out f.close(); I have deleted it.

6
  • It makes no sense to read/write xls or csv files in the way you are trying to do it. You need to use a specialized module, such as xlrd for xls files, or the csv module for csv files. Commented Apr 27, 2015 at 16:20
  • @ekhumoro you're right about xls, but writing a csv file does not require a special module. Commented Apr 27, 2015 at 17:51
  • @dbliss. I didn't claim that it wasn't possible. But the csv module exists for a good reason, and since it's part of the stdlib, it makes no sense not to use it. Commented Apr 27, 2015 at 17:56
  • You don't need to call f.close() when the file is opened as the context manager for the with statement. Commented Apr 27, 2015 at 18:00
  • @ekhumoro Thanks, I changed it. Please treat it as a .csv file; the problem remains the same: a Unicode display issue. Commented Apr 27, 2015 at 18:23

3 Answers

2

Excel (at least on Windows) likes a Unicode BOM at the start of a .csv file even with UTF-8. There is a codec for that, utf-8-sig.

Also, Python 3's built-in open is all that is required, and there is no need for f.close() inside a with block:

#coding:utf8
data = '''\
a1,ふじさん,c1
a2,ふじさん,c2
a3,ふじさん,c3
'''
with open('myFile.csv', 'w', encoding='utf-8-sig') as f:
    f.write(data)
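
To check that the BOM really ends up in the file, and to combine this with the csv module suggested in the comments, here is a minimal sketch for Python 3. The helper name write_csv_with_bom and the temporary file location are assumptions made for illustration, not part of the original code.

```python
import csv
import os
import tempfile

def write_csv_with_bom(path, rows):
    """Write rows as CSV, encoded as UTF-8 with a leading BOM (hypothetical helper)."""
    # newline='' is what the csv module documentation recommends for file objects.
    with open(path, 'w', encoding='utf-8-sig', newline='') as f:
        csv.writer(f).writerows(rows)

rows = [['a1', 'ふじさん', 'c1'], ['a2', 'ふじさん', 'c2']]
path = os.path.join(tempfile.mkdtemp(), 'myFile.csv')  # assumed location
write_csv_with_bom(path, rows)

# Inspect the raw bytes: the file should start with the UTF-8 encoded BOM.
with open(path, 'rb') as f:
    raw = f.read()
has_bom = raw.startswith(b'\xef\xbb\xbf')

# Reading back with utf-8-sig strips the BOM again.
with open(path, encoding='utf-8-sig', newline='') as f:
    read_back = list(csv.reader(f))
```

If has_bom is True and read_back equals rows, the file carries the signature Excel looks for and still round-trips cleanly in Python.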

1 Comment

utf-8-sig works. Thanks for pointing that out. I deleted f.close(). Thanks so much.
1

It seems you're trying to open the file in text mode (because you specify an encoding), but then you try to write binary data (because you encode the text before writing it to the file). You need to either open the file as binary and write encoded text, or open it as text and write text.

Furthermore, your attempt to open it as text isn't even working, because you're passing 'utf-8' as the buffering parameter instead of the encoding parameter. See the open() documentation.

But even if you did all that correctly, this still wouldn't really help you with an Excel file, because those have a complicated binary structure. I recommend you use something like xlrd to read xls files and XlsxWriter to write xlsx files.

Here is a simple example that should work for .csv:

with open('file.csv', 'w', encoding='utf-8') as fh:
    fh.write('This >µ< is a unicode GREEK LETTER MU\n')

or alternatively

with open('file.csv', 'wb') as fh:
    fh.write('This >µ< is a unicode GREEK LETTER MU\n'.encode('utf-8'))
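
As a rough check of the point above, the following Python 3 sketch writes the same text both ways, compares the resulting bytes, and then tries the mixed-up variant from the question. The file names and the temporary directory are assumptions for illustration only.

```python
import os
import tempfile

text = 'This >µ< is non-ASCII text\n'
tmpdir = tempfile.mkdtemp()  # assumed scratch location
text_path = os.path.join(tmpdir, 'text_mode.csv')
binary_path = os.path.join(tmpdir, 'binary_mode.csv')

# Text mode: pass str and let the file object do the encoding.
# newline='' avoids platform-specific newline translation.
with open(text_path, 'w', encoding='utf-8', newline='') as fh:
    fh.write(text)

# Binary mode: encode first, then write bytes.
with open(binary_path, 'wb') as fh:
    fh.write(text.encode('utf-8'))

with open(text_path, 'rb') as a, open(binary_path, 'rb') as b:
    same_bytes = a.read() == b.read()

# Mixing the two (bytes into a text-mode file) raises TypeError in Python 3.
try:
    with open(text_path, 'w', encoding='utf-8') as fh:
        fh.write(text.encode('utf-8'))
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

Both correct variants produce identical bytes on disk; only the mixed variant fails.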

1 Comment

Thanks! I think codecs.open does accept 'utf-8' as a positional parameter, and you are right that io.open needs it passed as encoding='utf-8'.
1

codecs.open opens a wrapped reader/writer which does the encoding/decoding for you, so you do not need to encode your string before writing. If you want unencodable characters to be dropped, pass 'ignore' as the errors argument of the codecs.open call instead of passing it to encode().

import codecs

with codecs.open('myFile.csv', 'w+', 'utf-8', 'ignore') as f:
    f.write(s)

Note that you do not need to call close as you use a with statement.
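
To illustrate what the errors argument does, here is a small sketch using codecs.open(filename, mode, encoding, errors). The sample string and the temporary path are assumptions for illustration. With 'utf-8' every ordinary character can be encoded, so 'ignore' has nothing to drop; the second write uses 'ascii' only to make the effect visible.

```python
import codecs
import os
import tempfile

s = u'Fuji: ふじさん'  # assumed sample text
path = os.path.join(tempfile.mkdtemp(), 'myFile.csv')  # assumed location

# The wrapped writer encodes for us, so we write text, not bytes.
with codecs.open(path, 'w', 'utf-8', 'ignore') as f:
    f.write(s)
with open(path, 'rb') as f:
    utf8_bytes = f.read()

# With a narrower codec, 'ignore' silently drops what cannot be encoded.
with codecs.open(path, 'w', 'ascii', 'ignore') as f:
    f.write(s)
with open(path, 'rb') as f:
    ascii_bytes = f.read()
```

Here utf8_bytes contains the full text, while ascii_bytes keeps only the ASCII prefix, which shows why 'ignore' can hide data loss.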

Original answer, scratch that:

The third parameter of open is buffering, which requires an integer. You should pass the encoding like this:

with open('myFile.xls', 'w+', encoding='utf-8') as f:
    f.write(s)

Note that you open the file in text mode. No need to encode the string for writing.

Also, your file mode 'w+' is a bit odd: it opens the file for both reading and writing and truncates it first. If you want to append to the file, you should use 'a' as the mode.
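
A quick way to see how the two modes differ is the following Python 3 sketch. The file name and its contents are assumptions chosen for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')  # assumed location

with open(path, 'w', encoding='utf-8') as f:
    f.write('first\n')

# 'a' keeps the existing content and writes at the end.
with open(path, 'a', encoding='utf-8') as f:
    f.write('second\n')
with open(path, encoding='utf-8') as f:
    after_append = f.read()

# 'w+' truncates on open, then allows both writing and reading.
with open(path, 'w+', encoding='utf-8') as f:
    f.write('third\n')
    f.seek(0)  # rewind before reading back what was just written
    after_w_plus = f.read()
```

After the append, the file holds both lines; after reopening with 'w+', only the newly written line remains.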

3 Comments

The OP seems to be using codecs.open, which has a different signature to the built-in open function.
Ooops. My fault. Didn't notice the comment, only tried to get the code working.
Corrected the answer, but leaving the first try for consistency at the bottom.
