1

I am sure that the problem I am experiencing isn't directly related to the OS or version, but instead to some kind of setup. In a python django app I am writing JSON to a file which can contain characters from other languages like 我, ל, and は. At no point in time in the flow of data am I changing the encoding as far as I am aware.

During local development, this was not a problem.

    with open(self._json_path, 'w') as f:
        json.dump(test_dict, f, indent=2, ensure_ascii=False)
    answer, wordid, question = self._unpack_dict(test_dict)

Once I deployed to the live web server, I began getting:

'ascii' codec can't encode character '\u90fd' in position 1: ordinal not in range(128)

I know for a fact that the data in test_dict is encoded properly. As soon as the json.dump occurs, it errors. If I open the file that was created, it fails at the very first non-latin character I put into it.

I've been through this post, but couldn't sort out the problem. Adding , encoding='utf8' causes the output of the above code to create a file but put nothing in it. Again, I know for a fact that test_dict has data as the data is displaying on the web page properly. **Answer to blank file: ** in troubleshooting I switched from dump to dumps which caused the files to be generated but not filled. **Answer to main problem: ** encoding='utf8' is not correct, it is encoding='utf-8'

I've also tried rebuilding the virtual environment.

On the server here are some results:

echo $LANG
en_US.UTF-8
python -c "import sys; print(sys.stdout.encoding)"
UTF-8
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Local environment:

echo $LANG
en_US.UTF-8
python -c "import sys; print(sys.stdout.encoding)"
UTF-8
LANG=en_US.UTF-8
LANGUAGE=en_US
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

The two most notable difference are:

Server: Python 3.5

Local environment: Python 3.7.2

apache2/error.log

[Fri Feb 15 09:49:15.899753 2019] [wsgi:error] [pid 2489] GETTING NEW TEST
[Fri Feb 15 09:49:15.957897 2019] [wsgi:error] [pid 2489] Saving the test dictionary
[Fri Feb 15 09:49:16.006470 2019] [wsgi:error] [pid 2489] Internal Server Error: /test/addohm/
[Fri Feb 15 09:49:16.006506 2019] [wsgi:error] [pid 2489] Traceback (most recent call last):
[Fri Feb 15 09:49:16.006509 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/exception.py", line 34, in in$
[Fri Feb 15 09:49:16.006511 2019] [wsgi:error] [pid 2489]     response = get_response(request)
[Fri Feb 15 09:49:16.006514 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/base.py", line 126, in _get_r$
[Fri Feb 15 09:49:16.006517 2019] [wsgi:error] [pid 2489]     response = self.process_exception_by_middleware(e, request)
[Fri Feb 15 09:49:16.006520 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/base.py", line 124, in _get_r$
[Fri Feb 15 09:49:16.006522 2019] [wsgi:error] [pid 2489]     response = wrapped_callback(request, *callback_args, **callback_kwargs)
[Fri Feb 15 09:49:16.006525 2019] [wsgi:error] [pid 2489]   File "/path/to/website/duotool/main/views.py", line 125, in test
[Fri Feb 15 09:49:16.006531 2019] [wsgi:error] [pid 2489]     form = TestForm(wordsdict)
[Fri Feb 15 09:49:16.006533 2019] [wsgi:error] [pid 2489]   File "/path/to/website/duotool/main/forms.py", line 21, in __init__
[Fri Feb 15 09:49:16.006536 2019] [wsgi:error] [pid 2489]     json.dumps(test_dict, f, indent=2, ensure_ascii=False)
[Fri Feb 15 09:49:16.006538 2019] [wsgi:error] [pid 2489]   File "/usr/lib/python3.5/json/__init__.py", line 179, in dump
[Fri Feb 15 09:49:16.006541 2019] [wsgi:error] [pid 2489]     fp.write(chunk)
[Fri Feb 15 09:49:16.006545 2019] [wsgi:error] [pid 2489] UnicodeEncodeError: 'ascii' codec can't encode character '\\u7ea6' in position 1: ordinal not in range(128)
[Fri Feb 15 09:49:16.006550 2019] [wsgi:error] [pid 2489]
[Fri Feb 15 09:51:36.924025 2019] [wsgi:error] [pid 2627] [client 124.9.54.252:55825] Timeout when reading response headers from daemon process 'duotool.addohm.net': /var/www/django/du$
[Fri Feb 15 09:51:43.283083 2019] [wsgi:error] [pid 2631] [client 124.9.54.252:55971] Truncated or oversized response headers received from daemon process 'duotool.addohm.net': /var/ww$
[Fri Feb 15 09:51:43.283324 2019] [wsgi:error] [pid 2630] [client 124.9.54.252:55888] Truncated or oversized response headers received from daemon process 'duotool.addohm.net': /var/ww$
[Fri Feb 15 09:53:18.572055 2019] [wsgi:error] [pid 2489] GETTING NEW TEST
[Fri Feb 15 09:53:18.631474 2019] [wsgi:error] [pid 2489] Saving the test dictionary
[Fri Feb 15 09:53:18.675315 2019] [wsgi:error] [pid 2489] Internal Server Error: /test/addohm/
[Fri Feb 15 09:53:18.675335 2019] [wsgi:error] [pid 2489] Traceback (most recent call last):
[Fri Feb 15 09:53:18.675338 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/exception.py", line 34, in in$
[Fri Feb 15 09:53:18.675341 2019] [wsgi:error] [pid 2489]     response = get_response(request)
[Fri Feb 15 09:53:18.675344 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/base.py", line 126, in _get_r$
[Fri Feb 15 09:53:18.675347 2019] [wsgi:error] [pid 2489]     response = self.process_exception_by_middleware(e, request)
[Fri Feb 15 09:53:18.675349 2019] [wsgi:error] [pid 2489]   File "/path/to/website/venv/lib/python3.5/site-packages/django/core/handlers/base.py", line 124, in _get_r$
[Fri Feb 15 09:53:18.675352 2019] [wsgi:error] [pid 2489]     response = wrapped_callback(request, *callback_args, **callback_kwargs)
[Fri Feb 15 09:53:18.675354 2019] [wsgi:error] [pid 2489]   File "/path/to/website/duotool/main/views.py", line 125, in test
[Fri Feb 15 09:53:18.675357 2019] [wsgi:error] [pid 2489]     form = TestForm(wordsdict)
[Fri Feb 15 09:53:18.675359 2019] [wsgi:error] [pid 2489]   File "/path/to/website/duotool/main/forms.py", line 21, in __init__
[Fri Feb 15 09:53:18.675362 2019] [wsgi:error] [pid 2489]     json.dumps(test_dict, f, indent=2, ensure_ascii=False)
[Fri Feb 15 09:53:18.675365 2019] [wsgi:error] [pid 2489]   File "/usr/lib/python3.5/json/__init__.py", line 179, in dump
[Fri Feb 15 09:53:18.675367 2019] [wsgi:error] [pid 2489]     fp.write(chunk)
[Fri Feb 15 09:53:18.675371 2019] [wsgi:error] [pid 2489] UnicodeEncodeError: 'ascii' codec can't encode character '\\u90fd' in position 1: ordinal not in range(128)
[Fri Feb 15 09:53:18.675376 2019] [wsgi:error] [pid 2489]
[Fri Feb 15 09:53:19.498167 2019] [wsgi:error] [pid 2652] /path/to/website/duotool

I can't see any differences between the two versions that would cause this. Where do I go from here?

4
  • First thing to check, is your locale is setup correctly. On your server, type locale. This should return a load of environment declarations. Check it doesn't throw errors. Commented Feb 15, 2019 at 10:42
  • 1
    Show us some sample data to encode and the full error traceback. Commented Feb 15, 2019 at 10:44
  • @KlausD. I added locale output information from both sides. Commented Feb 15, 2019 at 10:57
  • @AlastairMcCormack added two of the logged occurances. If I catch the django debug output, I'll post it as well. Commented Feb 15, 2019 at 11:01

1 Answer 1

1

I think you should specify the encoding for the file you are writing to when you open it. This should work:

   file = open(self._json_path, 'w',encoding='utf-8')
   file.write(json.dumps(your_json))
   file.close()
Sign up to request clarification or add additional context in comments.

2 Comments

I didn't do exactly this, but close enough. The post mentioned above showed utf8 as the string to put into the encoding= argument. Changed it to utf-8 and it work. P.S. The reason that the file didn't have any information was because I changed form dump to dumps in troubleshooting.
In Python, utf8 is an alias for utf-8. I think the problem was elsewhere.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.