I had trouble getting MySQL to handle UTF-8, so I ended up storing foreign characters as HTML entities, and I now regret that very much. I have 4 tables with a lot of foreign characters saved as HTML entities.

I have since rewritten the code and set up MySQL to process UTF-8 properly, but what would be the best way to convert the existing strings to UTF-8?

INSERT INTO `languages` (`id`, `title`, `native`, `alias`, `status`, `weight`, `updated`, `created`) VALUES
(1, 'English', 'English', 'en', 1, 1, '2009-11-02 21:37:38', '2009-11-02 20:52:00'),
(2, 'Dutch', 'Nederlands', 'nl', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(8, 'French', 'Français', 'fr', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(3, 'Spanish', 'Español', 'es', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(4, 'Italian', 'Italiano', 'it', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(6, 'German', 'Deutsch', 'de', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(7, 'Portuguese', 'Português', 'pt', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(11, 'Swedish', 'Svenska', 'sv', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(9, 'Polish', 'Polski', 'pl', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(12, 'Russian', 'Русский', 'ru', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(13, 'Afrikaans', 'Afrika', 'af', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(14, 'English', 'English', 'en', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(15, 'Albanian', 'Shqip', 'sq', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(16, 'Arabic', 'العربية', 'ar', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(17, 'Farsi', 'الفارسية', 'fa', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(18, 'Chinese (traditional)', '中文(繁體)', 'cht', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(19, 'Japanese', '日本', 'ja', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(20, 'Latin', 'Latina', 'la', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(21, 'Chinese (simplified)', '中文(简体)', 'chs', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(22, 'Turkish', 'Türkçe', 'tr', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(23, 'Catalan', 'Català', 'ca', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(24, 'Hindi', 'हिन्दी', 'hi', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00'),
(25, 'Hungarian', 'Magyar', 'hu', 1, NULL, '0000-00-00 00:00:00', '0000-00-00 00:00:00');   

Above is an example of SQL data.

1 Answer

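Round-trip the data through PHP. Do a SELECT, grab the relevant fields and run them through html_entity_decode() (not htmlentities(), which encodes rather than decodes), passing UTF-8 as the target charset, to convert the entities back into actual characters. Then write the data back into the database.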

MySQL itself doesn't have any entity encoding/decoding support, so doing the round trip is the quickest/easiest fix.
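The round trip described above could be sketched roughly as follows. This is only an illustration under some assumptions: it uses PDO, assumes each table has an `id` primary key like the `languages` table in the question, and the helper names `decode_entities` and `convert_column` are made up for this example. Back up the tables before running anything like this.

```php
<?php
// Decode named and numeric HTML entities into real UTF-8 characters.
function decode_entities(string $value): string
{
    return html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Hypothetical helper: rewrite one column of one table in place.
// $table and $column are interpolated into the SQL, so they must be
// trusted, hard-coded identifiers, never user input.
// Assumes the table has an `id` primary key.
function convert_column(PDO $pdo, string $table, string $column): int
{
    $rows = $pdo->query("SELECT `id`, `$column` FROM `$table`")
                ->fetchAll(PDO::FETCH_NUM);
    $update = $pdo->prepare("UPDATE `$table` SET `$column` = ? WHERE `id` = ?");

    $changed = 0;
    foreach ($rows as [$id, $value]) {
        if ($value === null) {
            continue; // nothing to decode
        }
        $decoded = decode_entities((string) $value);
        if ($decoded !== $value) {
            // Only touch rows that actually contained entities.
            $update->execute([$decoded, $id]);
            $changed++;
        }
    }
    return $changed;
}

// Example call (connection details are placeholders):
// $pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8mb4', $user, $pass);
// echo convert_column($pdo, 'languages', 'native'), " rows updated\n";
```

The connection itself has to be talking UTF-8 to MySQL (here via `charset=utf8mb4` in the DSN), otherwise the decoded characters can be mangled again on the way back in.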

3 Comments

Or within MySQL, but maybe not that convenient for the OP: forums.mysql.com/read.php?98,246527
I tried both htmlentities($input) and html_entity_decode($input), but both fail to decode to utf-8.
See php.net/htmlentities: there's a 3rd parameter for the function to specify the charset.
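To illustrate the charset parameter mentioned in the last comment, here is a small sketch with a made-up input string. One likely reason the decode "failed" is that PHP versions before 5.4 defaulted to ISO-8859-1 when no charset was given, so characters outside Latin-1 could not be produced; that is a guess about the commenter's setup, not something stated in the thread.

```php
<?php
// Sample input: one named entity and a run of numeric entities.
$input = 'Espa&ntilde;ol / &#1056;&#1091;&#1089;&#1089;&#1082;&#1080;&#1081;';

// htmlentities() goes the wrong way: it escapes the ampersands again.
$encoded = htmlentities($input, ENT_QUOTES, 'UTF-8');

// html_entity_decode() with an explicit UTF-8 target decodes both forms.
$decoded = html_entity_decode($input, ENT_QUOTES, 'UTF-8');

echo $encoded, "\n";
echo $decoded, "\n";
```

The second call is the one to use for the conversion; the first only shows why htmlentities() makes things worse on already-encoded data.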
