
I have a string variable res which I derived from a pyodbc cursor, as shown at the bottom. The table test has a single row containing ä, whose Unicode codepoint is u'\xe4'.

The result I get is:

>>> res,type(res)
('\xe4', <type 'str'>)

Whereas the result I should have got is:

>>> res,type(res)
(u'\xe4', <type 'unicode'>)

I tried adding charset=utf8 to my pyodbc connection string, as shown below. The result now came back as unicode, but the codepoint belonged to some other character, which could be due to a bug in the pyodbc driver.

conn = pyodbc.connect(DSN='datbase;charset=utf8',ansi=True,autocommit=True)
>>> res,type(res)
(u'\ua4c3', <type 'unicode'>)

Actual code

import pyodbc
pyodbc.pooling=False
conn = pyodbc.connect(DSN='datbase',ansi=True,autocommit=True)
cursor = conn.cursor()
cur = cursor.execute('SELECT col1 from test')
res = cur.fetchall()[0][0]
print(res)

Additional details:

Database: Teradata
pyodbc version: 2.7

So how do I now either

1) cast ('\xe4', <type 'str'>) to (u'\xe4', <type 'unicode'>) (is it possible to do this without unintended side effects?), or

2) resolve the pyodbc/unixodbc issue?


2 Answers


For Python 3, try this:

After conn = pyodbc.connect(DSN='datbase',ansi=True,autocommit=True)

Place this:

conn.setdecoding(pyodbc.SQL_CHAR, encoding='utf8')
conn.setdecoding(pyodbc.SQL_WCHAR, encoding='utf8')
conn.setencoding(encoding='utf8')

or

conn.setdecoding(pyodbc.SQL_CHAR, encoding='iso-8859-1')
conn.setdecoding(pyodbc.SQL_WCHAR, encoding='iso-8859-1')
conn.setencoding(encoding='iso-8859-1')

etc...
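
For example, a complete Python 3 sketch (keeping the DSN from the question and assuming the Teradata driver returns UTF-8 text; substitute 'iso-8859-1' if that is what it actually sends):

import pyodbc

conn = pyodbc.connect(DSN='datbase', ansi=True, autocommit=True)

# pyodbc 4.x: declare how text coming from the driver is decoded
# and how outgoing text is encoded.
conn.setdecoding(pyodbc.SQL_CHAR, encoding='utf-8')
conn.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-8')
conn.setencoding(encoding='utf-8')

cursor = conn.cursor()
res = cursor.execute('SELECT col1 FROM test').fetchone()[0]
print(res, type(res))  # expected: ä <class 'str'>, if the encoding matches the driver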

Python 2:

cnxn.setdecoding(pyodbc.SQL_CHAR, encoding='utf-8')
cnxn.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-8')
cnxn.setencoding(str, encoding='utf-8')
cnxn.setencoding(unicode, encoding='utf-8')

or, substituting whichever encoding your driver actually uses:

cnxn.setdecoding(pyodbc.SQL_CHAR, encoding='encode-foo-bar')
cnxn.setdecoding(pyodbc.SQL_WCHAR, encoding='encode-foo-bar')
cnxn.setencoding(str, encoding='encode-foo-bar')
cnxn.setencoding(unicode, encoding='encode-foo-bar')


3 Comments

Thanks for posting an update relevant to current versions of pyodbc (4.x and up). More details are available at the pyodbc Wiki.
Thank you! This was helpful for me using Python3 and MySQL ODBC 8.0 ANSI Driver on MacOS.
I am also a Mac user and faced the same problem; this solution helped me.

This is something I think is best handled with Python, instead of fiddling with pyodbc.connect arguments and driver-specific connection string attributes.

'\xe4' is a Latin-1 encoded string representing the unicode ä character.

To explicitly decode the pyodbc result in Python 2.7:

>>> res = '\xe4'
>>> res.decode('latin1'), type(res.decode('latin1'))
(u'\xe4', <type 'unicode'>)
>>> print res.decode('latin1')
ä
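
Applied to the query from the question, a minimal Python 2.7 sketch (assuming the driver really does hand back Latin-1 bytes, as above):

import pyodbc

pyodbc.pooling = False
conn = pyodbc.connect(DSN='datbase', ansi=True, autocommit=True)
cursor = conn.cursor()

res = cursor.execute('SELECT col1 from test').fetchall()[0][0]
# Decode the raw byte string from the driver into a unicode object.
if isinstance(res, str):
    res = res.decode('latin1')
print repr(res), type(res)  # u'\xe4' <type 'unicode'>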

Python 3.x does this for you (the str type includes unicode characters):

>>> res = '\xe4'
>>> res, type(res)
('ä', <class 'str'>)
>>> print(res)
ä

2 Comments

Thanks, it works as expected. However, our requirement is to support Asian characters too, so I am going with JayDeBeApi, which uses JDBC via jpype.
This approach is discouraged. "Fiddling" with connection parameters is the native way and should be used as described in @Ferrarezi's answer.
