MySQL Multilingual Encoding | Error Code: 1366. Incorrect string value: '\xCE\x09DIS'

Question

I am trying to set up a database to store string data in multiple languages, including Chinese characters among many others.

Steps I have taken so far:

I have created a schema which uses utf8mb4 character set and utf8mb4_unicode_ci collation.
I have created a table which includes CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; at the end of the CREATE statement.
I am attempting to LOAD DATA INFILE from a CSV file with CHARACTER SET utf8mb4 specified in the LOAD statement.

However, I am receiving this error: Error Code: 1366. Incorrect string value: '\xCE\x09DIS' for column 'company_name' at row 43630.

What is the encoding of the string data?

– snakecharmerb, Commented Feb 26, 2019 at 19:15
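
One way to approach the encoding question is to take the raw bytes from the error message and see which encodings accept them. The following is a minimal Python sketch, not part of the original thread; the helper name `try_decodings` and the list of candidate encodings are assumptions chosen for illustration.

```python
def try_decodings(raw, encodings=("utf-8", "latin-1", "cp1252", "gbk", "big5")):
    """Return {encoding: decoded text, or None if the bytes are invalid in it}."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            # The byte sequence is not well-formed in this encoding.
            results[enc] = None
    return results


sample = b"\xce\x09DIS"  # the bytes quoted in the error message
for enc, text in try_decodings(sample).items():
    print(f"{enc:8} -> {text!r}")
```

An encoding that decodes without error is only a candidate: a successful decode does not prove the text was written in that encoding, so the result still has to be compared with what the company name should be.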

Rick James · Accepted Answer · 2019-03-06 17:02:45Z

Did it successfully parse 43629 rows and then croak on row 43630? There may actually be garbage in the file.

Do you know what that company name should be? What does the rest of the line say?

Do you have another example? Remove that one line and run the LOAD again.
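
To find other examples before re-running the LOAD, the file can be scanned for lines that are not valid UTF-8. This is a rough sketch under stated assumptions: the function name `find_invalid_utf8_lines` and the file name `companies.csv` are hypothetical, and physical line numbers may not match the row number MySQL reports if the file has header lines or quoted fields containing newlines.

```python
import io


def find_invalid_utf8_lines(lines):
    """Yield (line_number, raw_bytes) for each line that is not valid UTF-8.

    `lines` is any iterable of bytes objects, e.g. a file opened in "rb" mode.
    """
    for line_number, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            yield line_number, raw


# Demo on in-memory data; for a real file (hypothetical name) use:
#     with open("companies.csv", "rb") as f:
#         for n, raw in find_invalid_utf8_lines(f): ...
demo = io.BytesIO(b"1,Acme\n2,\xce\x09DIS\n3,\xce\xb2-carotene\n")
for n, raw in find_invalid_utf8_lines(demo):
    print(n, raw)
```

Reading in binary mode matters here: opening the file in text mode would make Python decode it first and raise on the very bytes being looked for.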

The byte CE can be interpreted by most 1-byte charsets (latin1, for example), but not necessarily in a meaningful way.

09 is the "tab" character in virtually all charsets; is it reasonable to have a tab in a company name?
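
A quick way to check whether a field contains a tab or another control character is to look at Unicode categories. The sketch below is illustrative only; the helper name `control_chars` is an assumption, and it expects text that has already been decoded.

```python
import unicodedata


def control_chars(text):
    """Return (index, code point) pairs for control characters in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cc"  # "Cc" is the control-character category
    ]


print(control_chars("Acme Ltd"))
print(control_chars("A\tDIS"))
```

A non-empty result for a company name suggests a stray delimiter or damaged data rather than a character-set problem.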

2 Comments

mLstudent33 Over a year ago

What do I do about a beta symbol as in beta-carotene? stackoverflow.com/questions/64687739/…

Rick James Over a year ago

@webNoob13 CEB2 is the UTF-8 encoding for the lowercase Greek "beta". CE09 does not make sense: it is not a correct UTF-8 encoding, because CE starts a 2-byte sequence whose second byte must be in the range 80 to BF, and 09 is "tab" in most encodings. Check my Python tips: mysql.rjweb.org/doc.php/charcoll#python
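
The byte-level difference between CEB2 and CE09 can be checked with a few lines of Python. This is a minimal sketch of the rule for 2-byte UTF-8 sequences only (lead byte C2 to DF, continuation byte 80 to BF); the function name `is_valid_two_byte_utf8` is an assumption, not something from the linked tips.

```python
def is_valid_two_byte_utf8(lead, trail):
    """True if (lead, trail) is a well-formed 2-byte UTF-8 sequence."""
    # Lead bytes C2..DF announce a 2-byte sequence; the second byte
    # must be a continuation byte in the range 80..BF.
    return 0xC2 <= lead <= 0xDF and 0x80 <= trail <= 0xBF


print("β".encode("utf-8").hex())           # ceb2
print(is_valid_two_byte_utf8(0xCE, 0xB2))  # True
print(is_valid_two_byte_utf8(0xCE, 0x09))  # False
```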
