You say the character works when you enter it directly in MySQL, and that it is only wrong when Java puts it there.
Have you tried getting your code to look for these characters and dump them to a text file or to stdout as a short test, so you can compare the stdout text against what got sent to the DB?
It is also worth logging the DB transactions to see what was actually sent.
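A minimal sketch of that kind of dump, assuming you can call a helper on the string just before it is handed to JDBC. The class name CharDump, the method describe and the sample value are made up for illustration; only standard JDK calls are used.

```java
import java.nio.charset.StandardCharsets;

class CharDump {
    // Describe every non-ASCII code point in the text as U+XXXX plus its UTF-8 bytes in hex.
    static String describe(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 0x7F) {
                String ch = new String(Character.toChars(cp));
                out.append(String.format("U+%04X", cp)).append(" utf8=");
                for (byte b : ch.getBytes(StandardCharsets.UTF_8)) {
                    out.append(String.format("%02X", b & 0xFF));
                }
                out.append('\n');
            }
            // Advance by one full code point (two chars for surrogate pairs).
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample; in practice pass the value your code is about to send to the DB.
        System.out.print(describe("Acme\u00AE \u65E5\u672C"));
    }
}
```

Because the output is pure ASCII, it survives a badly configured terminal or log file, so you can compare it safely against what ends up in the table.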
As far as the MySQL config goes, make sure both the tables and MySQL itself are running in UTF-8 mode:
[client]
default-character-set=utf8
# This was formerly known as [safe_mysqld]. Both versions are currently parsed.
[mysqld_safe]
default-character-set=utf8
default-collation=utf8_general_ci
character-set-server=utf8
collation-server=utf8_general_ci
init-connect='SET NAMES utf8'
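[mysqld]
# default-character-set and default-collation are only accepted here by older
# servers (they were removed from mysqld in MySQL 5.5). On newer versions drop
# these two lines and keep the character-set-server / collation-server pair.
default-character-set=utf8
default-collation=utf8_general_ci
character-set-server=utf8
collation-server=utf8_general_ci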
Make sure the above has been put into /etc/mysql/my.cnf, then restart MySQL so that it takes effect.
For each DB name you have, run the query below. It writes out one ALTER line per table to convert that table to utf8 (shown here for a DB called "userbase"):
SELECT CONCAT("ALTER TABLE `", i.TABLE_NAME, "` CONVERT TO CHARACTER SET utf8;") AS MySQLCMD
FROM information_schema.TABLES i
WHERE i.TABLE_SCHEMA = "userbase"
INTO OUTFILE '/tmp/userbase.csv';
The query only generates the statements, so run the contents of /tmp/userbase.csv against that DB afterwards.
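If you would rather build the same statements from Java, for example to log or review them first, here is a rough sketch. The class and method names and the table names "users" and "orders" are hypothetical. It only builds strings and does not talk to the database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AlterStatements {
    // Build one ALTER statement, quoting the table name with backticks
    // (a backtick inside the name is escaped by doubling it).
    static String alterFor(String table) {
        String quoted = "`" + table.replace("`", "``") + "`";
        return "ALTER TABLE " + quoted + " CONVERT TO CHARACTER SET utf8;";
    }

    // Build the statements for a whole list of table names, in order.
    static List<String> alterAll(List<String> tables) {
        List<String> out = new ArrayList<>();
        for (String t : tables) {
            out.add(alterFor(t));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical table names; in practice take them from information_schema.TABLES.
        for (String sql : alterAll(Arrays.asList("users", "orders"))) {
            System.out.println(sql);
        }
    }
}
```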
Other things worth trying, especially if the aim is to write UTF-8 on this server:
Linux system environment:
Check the Unix locale. Running
locale
should give output like this:
LANG=en_GB.UTF-8
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"
LC_ALL=
To fix this if your output differs:
sudo dpkg-reconfigure locales
(select en_GB.UTF-8 in the list), then
sudo update-locale LANG=en_GB.UTF-8
Restart the box so that services pick up UTF-8. As a user you will need
to log out completely and back in; check locale after logging back in,
before the reboot, to ensure it is working.
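The JVM normally derives its default charset from this environment, so it is worth checking from Java's side as well. A small sketch (the class name JvmCharsetCheck is made up; note that on JDK 18 and later the default charset is UTF-8 regardless of the locale, so this mainly matters on older JVMs):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class JvmCharsetCheck {
    // True when the given charset is UTF-8.
    static boolean isUtf8(Charset cs) {
        return StandardCharsets.UTF_8.equals(cs);
    }

    public static void main(String[] args) {
        Charset def = Charset.defaultCharset();
        // Print what the JVM actually picked up from the environment.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + def.name());
        System.out.println("LANG            = " + System.getenv("LANG"));
        if (!isUtf8(def)) {
            System.out.println("Warning: JVM default charset is not UTF-8");
        }
    }
}
```

Run it as the same user and in the same way that Tomcat is started, since a service may see a different environment from your login shell.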
This also means you can now input Japanese over your local SSH session
(if you use PuTTY, UTF-8 needs to be selected in its settings).
- Tomcat:
Add URIEncoding="UTF-8" to the HTTP <Connector> element in server.xml.
I also added it to the AJP connector:
<Connector port="8009"......
protocol="AJP/1.3" URIEncoding="UTF-8" />
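To see why URIEncoding matters, here is a sketch of what happens when percent-encoded UTF-8 bytes in a URL are decoded with the wrong charset (older Tomcat versions default to ISO-8859-1 for this). The class name UriDecodeDemo is made up and it uses the plain JDK URLDecoder rather than Tomcat itself, so treat it as an illustration of the effect and not of Tomcat's internals.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class UriDecodeDemo {
    // Decode a percent-encoded value using the named charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // %C2%AE is the UTF-8 encoding of U+00AE (registered sign).
        String raw = "%C2%AE";
        String good = decode(raw, "UTF-8");
        String bad = decode(raw, "ISO-8859-1");
        // Print lengths and code points so the result does not depend on the console encoding.
        System.out.println("UTF-8:      length=" + good.length()
                + " first=U+" + Integer.toHexString(good.codePointAt(0)).toUpperCase());
        System.out.println("ISO-8859-1: length=" + bad.length()
                + " first=U+" + Integer.toHexString(bad.codePointAt(0)).toUpperCase());
    }
}
```

With the wrong charset the single character turns into two (U+00C2 followed by U+00AE), which is the classic corrupted form of this character.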
In the web.xml for local sites (WEB-INF/web.xml), add a character-set
filter (I am unsure whether this is essential):
<web-app>
<filter>
<filter-name>charsetFilter</filter-name>
<filter-class>filters.SetCharacterEncodingFilter</filter-class>
<init-param>
<param-name>encoding</param-name>
<param-value>UTF-8</param-value>
</init-param>
</filter>
Then look for the filter mappings and also add:
<!-- Define filter mappings for the defined filters -->
<filter-mapping>
<filter-name>charsetFilter</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>
I have come across character-specific corruption issues before. It is worth saving the udp string to a file and viewing it in a good UTF-8 editor (Notepad++ with the UTF-8 option enabled, or Kate or something similar on KDE).
Also test different UTF-8 characters, both the ones that work and the ones that potentially don't, via stdout or a file. Look each one up on
http://www.fileformat.info/info/unicode/char/search.htm
and ensure the characters you get back are the same as the reference, e.g.
http://www.fileformat.info/info/unicode/char/00ae/index.htm
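To do that comparison in code, a small sketch that finds the first code point where two strings differ, for example the value you sent and the value you read back from the DB. The class name, the method name and the sample strings are hypothetical.

```java
class FirstDifference {
    // Index (counted in code points) of the first difference, or -1 if the strings are identical.
    static int firstDifference(String a, String b) {
        int[] x = a.codePoints().toArray();
        int[] y = b.codePoints().toArray();
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            if (x[i] != y[i]) {
                return i;
            }
        }
        // Same prefix: identical if lengths match, otherwise they differ where the shorter one ends.
        return x.length == y.length ? -1 : n;
    }

    public static void main(String[] args) {
        // Hypothetical values: what was sent, and a corrupted copy read back from the DB.
        String sent = "Acme\u00AE";
        String stored = "Acme\u00C2\u00AE";
        int idx = firstDifference(sent, stored);
        System.out.println("first difference at code point index " + idx);
        if (idx >= 0 && idx < stored.codePointCount(0, stored.length())) {
            int cp = stored.codePoints().toArray()[idx];
            System.out.println("stored has " + String.format("U+%04X", cp) + " there");
        }
    }
}
```

The U+XXXX value it prints can be looked up directly on the fileformat.info pages above.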