16

I'm trying to use MySQL Connector/Python from mysql.com with Python 3.

I have tables in UTF-8 coding, and when I fetch the rows, all my chars columns returned like bytearray. This is make some confusion.

How I can fetch directly str?

UPD:

# -*- coding: utf-8 -*-
import mysql.connector
con = mysql.connector.connect( user ="root", db = "vg_site_db", charset = 'utf8' )
cursor = con.cursor()
sql = """select caption from domains
"""
cursor.execute( sql )
row = cursor.fetchone()
while row is not None:
    print( row )
    row = cursor.fetchone()

output:

(bytearray(b'ezsp.ru'),)
(bytearray(b'eazyshop.ru'),)
(bytearray(b'127.0.0.1:8080'),)
(bytearray(b'rmsvet.ru'),)

I want:

('ezsp.ru',)
('eazyshop.ru',)
('127.0.0.1:8080',)
('rmsvet.ru',)

UPD2:

My tables use COLLATE utf8_bin.

3
  • show your python code that reads database content Commented Dec 19, 2014 at 12:24
  • I update post and add the code samples Commented Dec 19, 2014 at 12:45
  • I'm also getting bytearray's from cursor.fetchone() in Python 2.7. Passing them through str(row[0].decode()) turned them into native strings in both Python 2.7 and 3.4. MySQL made a change in Connector version 2. Commented May 21, 2015 at 23:26

6 Answers 6

6

Seems like this happens when you use binary collation, at least the same happened to me. To convert the bytearrays to Unicode strings, you can add a custom converter class:

class MyConverter(mysql.connector.conversion.MySQLConverter):

    def row_to_python(self, row, fields):
        row = super(MyConverter, self).row_to_python(row, fields)

        def to_unicode(col):
            if isinstance(col, bytearray):
                return col.decode('utf-8')
            return col

        return[to_unicode(col) for col in row]

sql = mysql.connector.connect(converter_class=MyConverter, host=...)
Sign up to request clarification or add additional context in comments.

2 Comments

Unfortunately this no longer works, because converters now have to inherit from MySQLConverterBase and that one doesn't have the row_to_python method, unless you copy paste it yourself and ugh.
@Tominator : Sure? It still works for me. Also, MySQLConverter extends MySQLConverterBase and still has the row_to_python method, so I don't immediately see why it shouldn't work.
4

MySQL Connector returns strings (as stored using the CHAR, VARCHAR, and TEXT data types) as bytearrays when respective columns are defined with a binary collation (e.g. utf8_bin). You must call .decode() on values to get Python strings, e.g.:

for row in cursor:
    caption = row[0].decode()

That said, unless you have a specific requirement to use utf8_bin, it's a much better idea to use the utf8mb4 character set with the collation utf8mb4_unicode_ci on the database level. That would solve your problem and allow for full Unicode support. See this and this fore more details.

2 Comments

Wouldn't you rather recomment utf8mb_unicode_cs? As I understand it, bin sorts/selects by exact character code, cs sorts like a regular user would expect, and ci is case insensitive but otherwise like cs.
@lucidbrot There's no utf8mb_unicode_cs collation, but perhaps you can use utf8mb4_0900_as_cs. As for which one is better, it depends. Generally, capital letters are considered to be identical to their corresponding lower case letters for the purposes of alphabetical ordering.
1

Adding mysql-connector-python==8.0.17 to requirements.txt resolved this issue for me.

Comments

0

Although the most voted answer (by @danmichaelo) certainly works, I wanted to provide my version that solves a major "but" that is already pointed out by @Tominator; custom converter classes must now inherit MySQLConverterBase and not MySQLConverter. The reason you do not want to inherit MySQLConverter (even if it inherits MySQLConverterBase as @danmichaelo points out) is that it will invoke the corresponding converter in the MySQLConverter class on each returned value. This will implement a logic you may not want.

To avoid the above, you have two choices. First, you could create a higher-level function that will get the data and alter the rows after they are retrieved.

def get_data_from_db(cursor, sql_query):
    cursor.execute(sql)
    row = cursor.fetchone()
    while row is not None:
        row_to_return = row.decode('utf-8') if isinstance(row, bytearray) else row
        row = cursor.fetchone()
    
    return row

If you still want to use the custom converter class method, then you should inherit the MySQLConverterBase as suggested in the documentation (https://dev.mysql.com/doc/connector-python/en/connector-python-connectargs.html this is valid until mysql-connector-python==8.0.26, see below) and then you could extend the MySQLConverterBase.to_python method.

class MyConverter(mysql.connector.conversion.MySQLConverterBase):
    def to_python(self, vtype, value):
        """Convert MySQL data type to Python"""
        if isinstance(value, bytearray):
            return value.decode('utf-8')

        super().to_python(vtype, value)

P.S. The class MyConverter can be used to implement custom converters by creating functions with names as in the MySQLConverter class (find the class here: https://github.com/mysql/mysql-connector-python/blob/master/lib/mysql/connector/conversion.py). For example, I was looking to convert TINYINT to bool and added a method named MyConverter._TINY_to_python(self, value, desc=None)

-- Update mysql-connector-python==8.0.27 --

In the 8.0.27 version, if you create a converter class that inherits the MySQLConverterBase you will probably get an error saying "expected a bytes-like object, str found". It is unclear to me why this happens but my answer above on creating the custom converter seems to not hold anymore. Instead, one should inherit the MySQLConverter class now:

class MyConverter(mysql.connector.conversion.MySQLConverter):
    def to_python(self, vtype, value):
        """Convert MySQL data type to Python"""
        if isinstance(value, bytearray):
            return value.decode('utf-8')

        super().to_python(vtype, value)

Comments

-1

I don't think that you can get the cursor to return strings. The MySQL Connector Documentation says that they chose to return bytearrays so that they only have to maintain one codebase for both Python2 and Python3:

With the use of “raw” cursors, the returned values is of the bytearray type. This is necessary for having both Python 2 and 3 return the same data.

I addressed this issue using a list comprehension to decode each bytearray in the row:

for row in cursor:
    type_fixed_row = tuple([el.decode('utf-8') if type(el) is bytearray else el for el in row])
    print( type_fixed_row )

1 Comment

raw is not enabled by default, so I find this puzzling. The documentation says "By default, strings coming from MySQL are returned as Python Unicode literals."
-2

An easy way to solve this issue is to make sure that you are retrieving 'strings' from your MySQL table. To do so, you just have to add a CAST in your query as follows:

 # -*- coding: utf-8 -*-
import mysql.connector
con = mysql.connector.connect( user ="root", db = "vg_site_db", charset = 'utf8' )
cursor = con.cursor()
sql = "select CAST(caption as CHAR(50)) from domains"
cursor.execute( sql )
row = cursor.fetchone()
while row is not None:
    print( row )
    row = cursor.fetchone()

This should work for you.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.