Two questions. 1. Why does
In [21]: for root, dir, file in os.walk(spath):
   ....:     print(root)
print the whole tree, but
In [6]: for dirs in os.walk(spath):
   ...:     print(dirs)
chokes on this unicode error?
UnicodeEncodeError: 'charmap' codec can't encode character '\u2122' in position 1477: character maps to <undefined>
[NOTE: this is the TM symbol]
I looked at these answers:
What's the deal with Python 3.4, Unicode, different languages and Windows?
https://github.com/Drekin/win-unicode-console
https://docs.python.org/3/search.html?q=IncrementalDecoder&check_keywords=yes&area=default
and tried these variations:
----> 1 print(dirs, encoding='utf-8')
TypeError: 'encoding' is an invalid keyword argument for this function
In [11]: >>> u'\u2122'.encode('ascii', 'ignore')
Out[11]: b''
print(dirs).encode(‘utf=8’)
all to no effect.
This was done with Python 3.4.3 and Visual Studio Code 1.6.1 on Windows 10. The default settings in Visual Studio Code include:
// The default character set encoding to use when reading and writing files.
"files.encoding": "utf8",
python 3.4.3 visual studio code 1.6.1 ipython 3.0.0
UPDATE: I tried this again in the Sublime Text REPL, running a script. Here's what I got:
# -*- coding: utf-8 -*-
import os
spath = 'C:/Users/Semantic/Documents/Align'
with open('os_walk4_align.txt', 'w') as f:
    for path, dirs, filenames in os.walk(spath):
        print(path, dirs, filenames, file=f)
Traceback (most recent call last):
  File "listdir_test1.py", line 8, in <module>
    print(path, dirs, filenames, file=f)
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2605' in position 300: character maps to <undefined>
2. This code is only 217 characters long, so where does ‘position 300’ come from?
The source encoding declaration (#coding:utf8) has nothing to do with the output encoding. As you can see from your error, cp1252 is the output encoding, and it doesn't support the characters being printed to the terminal. The easiest way around this is to write to a file with UTF-8 encoding instead of printing to a display, or to use a Python IDE that supports UTF-8 output. I'm not familiar with Sublime Text, but it probably has a way to adjust the output encoding as well.
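As a minimal sketch of the "write to a file with UTF-8 encoding" suggestion: the script in the question opens the file without an encoding argument, so Python falls back to the locale default (cp1252 in the traceback). Passing encoding='utf-8' to open() should avoid that. The helper name write_walk_listing is hypothetical and only for illustration; it takes any iterable of (path, dirs, filenames) tuples, such as the one returned by os.walk.

```python
import os


def write_walk_listing(walk_entries, out_path):
    """Write one line per (path, dirs, filenames) tuple to a UTF-8 text file.

    write_walk_listing is a hypothetical helper name, not part of the
    original script.
    """
    # encoding='utf-8' overrides the locale default (cp1252 in the traceback),
    # so characters such as U+2605 or U+2122 can be written.
    with open(out_path, 'w', encoding='utf-8') as f:
        for path, dirs, filenames in walk_entries:
            print(path, dirs, filenames, file=f)


# Usage with the path from the question:
# write_walk_listing(os.walk('C:/Users/Semantic/Documents/Align'),
#                    'os_walk4_align.txt')
```

The resulting file needs to be opened as UTF-8 in whatever editor or viewer reads it; otherwise the non-ASCII characters will show up garbled.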