How can I get data from a UTF-8-encoded MySQL database without getting a UnicodeDecodeError? I'm building a website with Python (Flask) and HTML templates. Here's the code I use to fetch rows from the database, which seemed to work fine before I switched the database's encoding to UTF-8:

@app.route("/songs")
def content_database_song():
  c = connect_db()
  # Triple quotes are required for a string literal that spans several lines.
  c.execute("""
  SELECT * FROM Tracks
  JOIN Artists USING (ArtistID)
  JOIN Albums USING (AlbumID)
  JOIN Songs USING (SongID)
  ORDER BY UPPER(SoName), UPPER(AlTitle)
  """)
  songslist = []
  rows = c.fetchall()
  for row in rows:
    songslist.append(row)
  return render_template("/song-index.html", songslist = songslist)

Here's the complete traceback:

UnicodeDecodeError
UnicodeDecodeError: 'ascii' codec can't decode byte 0x96 in position 10: ordinal not in range(128)

Traceback (most recent call last)

File "/Library/Python/2.7/site-packages/Flask-0.7.2-py2.7.egg/flask/app.py", line 1306, in __call__
return self.wsgi_app(environ, start_response)
File "/Library/Python/2.7/site-packages/Flask-0.7.2-py2.7.egg/flask/app.py", line 1294, in wsgi_app
response = self.make_response(self.handle_exception(e))
File "/Library/Python/2.7/site-packages/Flask-0.7.2-py2.7.egg/flask/app.py", line 1292, in wsgi_app
response = self.full_dispatch_request()
File "/Library/Python/2.7/site-packages/Flask-0.7.2-py2.7.egg/flask/app.py", line 1062, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/Library/Python/2.7/site-packages/Flask-0.7.2-py2.7.egg/flask/app.py", line 1060, in full_dispatch_request
rv = self.dispatch_request()
File "/Library/Python/2.7/site-packages/Flask-0.7.2-py2.7.egg/flask/app.py", line 1047, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/Users/samuelbradshaw/Sites/praises/index.py", line 59, in content_database_song
return render_template("/song-index.html", songslist = songslist)
File "/Library/Python/2.7/site-packages/Flask-0.7.2-py2.7.egg/flask/templating.py", line 121, in render_template
context, ctx.app)
File "/Library/Python/2.7/site-packages/Flask-0.7.2-py2.7.egg/flask/templating.py", line 105, in _render
rv = template.render(context)
File "/Library/Python/2.7/site-packages/Jinja2-2.6-py2.7.egg/jinja2/environment.py", line 894, in render
return self.environment.handle_exception(exc_info, True)
File "/Users/samuelbradshaw/Sites/praises/templates/song-index.html", line 1, in top-level template code
{% extends "database-nav.html" %}
File "/Users/samuelbradshaw/Sites/praises/templates/database-nav.html", line 1, in top-level template code
{% extends "layout.html" %}
File "/Users/samuelbradshaw/Sites/praises/templates/layout.html", line 26, in top-level template code
{% block content %}{% endblock %}
File "/Users/samuelbradshaw/Sites/praises/templates/database-nav.html", line 13, in block "content"
{% block subcontent %}
File "/Users/samuelbradshaw/Sites/praises/templates/song-index.html", line 47, in block "subcontent"
<strong>Related Scriptures:</strong> {% if song.SoRelatedScriptures != "" %}{{song.SoRelatedScriptures}}{% else %}None{% endif %}<br>
File "/Library/Python/2.7/site-packages/Jinja2-2.6-py2.7.egg/jinja2/_markupsafe/_native.py", line 21, in escape
return Markup(unicode(s)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x96 in position 10: ordinal not in range(128)
  • Post the whole traceback, please. That will at least give us some idea where you're getting that error. Commented Apr 22, 2012 at 2:49
  • My guess is that you should switch back from UTF-8. Look here: stackoverflow.com/questions/7873556/… Commented Apr 22, 2012 at 2:50
  • If the first byte that guff.decode('ascii') complains about is 0x96, then guff is not encoded in UTF-8 -- 0x96 is NOT a valid UTF-8 start byte. I'd suggest inserting print repr(row) inside that for loop so that we can see exactly what you've got, instead of guessing. What was the database's encoding before you switched it to UTF-8? Did you reload all your text data after the switch? Commented Apr 22, 2012 at 2:57
  • It was latin1 before I switched it to UTF-8. I had to switch it because it wouldn't let me put in certain punctuation marks (like the curly single quote and dashes). Commented Apr 22, 2012 at 5:44
  • You should not only switch but also convert the database to UTF-8. My usual procedure: back up, export to SQL, create a new database in UTF-8, import, rename both the new and the old database, verify for a while, then drop the old one. Commented Apr 22, 2012 at 11:04
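
To make the point about byte 0x96 concrete, here is a minimal sketch of how one might check which encoding a suspect byte string is really in. The sample value and the helper name try_decodings are made up for illustration; real values would come from the print repr(row) suggested in the comments. It assumes the old latin1 data corresponds to Python's cp1252 codec, which is how MySQL defines its latin1 character set as far as I know.

```python
def try_decodings(raw, encodings=("ascii", "utf-8", "cp1252")):
    """Map each encoding name to the decoded text, or None if decoding fails."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Hypothetical sample: byte 0x96 sits at position 10, as in the traceback.
sample = b"Psalm 23:1\x966"
results = try_decodings(sample)

for name in ("ascii", "utf-8", "cp1252"):
    # ascii and utf-8 fail on 0x96; cp1252 maps it to an en dash (U+2013).
    print("%-7s %r" % (name, results[name]))
```

If a value only decodes as cp1252, the bytes reaching Python are still in the old encoding, whatever the database's declared character set says.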

1 Answer

I just needed to change one part of the code: the connect_db() function called in the snippet posted above. I changed this:

def connect_db():
  global conn
  conn = mdb.connect(dbinfo.server, dbinfo.username, dbinfo.password, dbinfo.database)
  return conn.cursor(mdb.cursors.DictCursor)

to this:

def connect_db():
  global conn
  conn = mdb.connect(dbinfo.server, dbinfo.username, dbinfo.password, dbinfo.database, charset='utf8', use_unicode=True)
  return conn.cursor(mdb.cursors.DictCursor)

Note the charset='utf8', use_unicode=True when connecting. That's all I had to change after switching my database to Unicode! :)
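
For readers wondering what those two parameters change, here is a rough sketch of the effect, not MySQLdb's actual implementation. As I understand it, charset='utf8' asks the server to send text as UTF-8 over the connection, and use_unicode=True makes the driver hand back decoded text instead of raw byte strings, so Jinja never has to fall back on an implicit ASCII decode. The helper decode_row and the sample row are hypothetical.

```python
def decode_row(row, charset="utf-8"):
    """Return a copy of row with every byte-string value decoded to text."""
    return {
        key: value.decode(charset) if isinstance(value, bytes) else value
        for key, value in row.items()
    }


# Hypothetical row as a DictCursor might return it without use_unicode:
# text columns arrive as UTF-8 encoded byte strings.
raw_row = {"SongID": 7, "SoName": u"Hymn \u2013 Part 1".encode("utf-8")}

decoded = decode_row(raw_row)
print(repr(decoded["SoName"]))
```

Letting the driver do this at connection time is simpler than decoding in every view, which is presumably why the one-line change to connect_db() was enough.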
