I want to insert Arabic text into the database, but I always get characters like this: Ø§Ø¨Ùˆ Ù†Øµ. I use UTF-8 encoding in my pages and I set my database to utf8_general_ci.

I have read many questions similar to this one, but I haven't found a solution for my case.

This is a solution, but it is in PHP and I don't know how to do the same thing in Java.

The insert code (using JdbcTemplate):

final String move_insert = "insert into r_movement "
        + "(PPR,cd_fonc,nom_etabl,ville,delegation,date_debut,date_fin,nbjour,nbmois,nbannees,cina,cinn) "
        + "values (?,?,?,?,?,?,?,?,?,?,?,?)";

getJdbcTemplate().update(move_insert, new Object[] {
        move.getPpr(), move.getFonction(), move.getNom_etabl(), move.getVille(),
        move.getDelegation(), move.getDate_debut(), move.getDate_fin(),
        c.getNbjours(), c.getNbmois(), c.getNbyears(),
        move.getCina(), move.getCinn()});

This is my table :

CREATE TABLE `r_movement` (
 `id_move` int(11) NOT NULL AUTO_INCREMENT,
 `PPR` int(11) NOT NULL,
 `cd_fonc` varchar(255) CHARACTER SET utf8 NOT NULL,
 `nom_etabl` varchar(255) CHARACTER SET utf8 NOT NULL,
 `ville` varchar(255) CHARACTER SET utf8 NOT NULL,
 `delegation` varchar(255) CHARACTER SET utf8 NOT NULL,
 `date_debut` date NOT NULL,
 `date_fin` date NOT NULL,
 `nbjour` int(255) NOT NULL,
 `nbmois` int(255) NOT NULL,
 `nbannees` int(255) NOT NULL,
 `CINA` varchar(255) CHARACTER SET utf8 NOT NULL,
 `CINN` varchar(255) CHARACTER SET utf8 NOT NULL,
 PRIMARY KEY (`id_move`)
) ENGINE=InnoDB AUTO_INCREMENT=17 DEFAULT CHARSET=utf8
  • First step: separate out the database access from the web page part. I suggest you write a short console app which just inserts data and then retrieves it. Diagnose the strings by printing out their UTF-16 code units (use charAt and convert each char to an int). Also, please show the code you're using to insert the data. Commented Jun 2, 2013 at 18:57
  • I am also using Arabic characters with MySQL. I use InnoDB and the default charset is utf8, and I have no problem with it. Did you check whether the characters inside the database also look like Ø§Ø¨Ùˆ Ù†Ø? Commented Jun 2, 2013 at 19:02
  • My table is InnoDB too, but which language do you use to insert data, @AzadOmer? Commented Jun 2, 2013 at 19:07
  • @JonSkeet I do separate the database access from the web page; I'm using the MVC pattern. Commented Jun 2, 2013 at 19:10
  • My point is that in order to diagnose the problem you should completely separate the two. Work out whether the problem is on the web side or the database side. Commented Jun 2, 2013 at 19:15
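
The diagnostic suggested in the comments above can be sketched as follows. This is a minimal sketch, not code from the thread: the class name `CodeUnitDump` and the method `dump` are hypothetical. It prints each UTF-16 code unit of a string in hex, so a value read back from the database can be compared with the value that was sent.

```java
class CodeUnitDump {
    /** Returns the UTF-16 code units of s as space-separated 4-digit hex values. */
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // charAt returns one UTF-16 code unit; widen it to int for formatting
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Arabic sample from the question, written with Unicode escapes so
        // that the encoding of the source file cannot interfere with the test.
        String sent = "\u0627\u0628\u0648 \u0646\u0635";
        System.out.println(dump(sent)); // 0627 0628 0648 0020 0646 0635
    }
}
```

If the string read back from the database prints values such as 00D8 00A7 instead of 0627, the text was damaged before or during the insert rather than when the page was rendered.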

3 Answers

I finally solved the problem: the encoding filter configuration was missing from web.xml.

<filter>
  <filter-name>encoding-filter</filter-name>
  <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
  <init-param>
    <param-name>encoding</param-name>
    <param-value>utf-8</param-value>
  </init-param>
  <init-param>
    <param-name>forceEncoding</param-name>
    <param-value>true</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>encoding-filter</filter-name>
  <url-pattern>/*</url-pattern>
  <dispatcher>REQUEST</dispatcher>
  <dispatcher>FORWARD</dispatcher>
</filter-mapping>

I can now insert Arabic data into the database safely. Thanks!
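
A likely reason the filter helps: when a request does not declare an encoding, servlet containers commonly decode form parameters as ISO-8859-1, and the filter tells the container to use UTF-8 instead. The sketch below only illustrates that difference with the JDK's `URLDecoder`; it is not the filter's implementation, and the class name `RequestDecodingDemo` is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class RequestDecodingDemo {
    // The first word of the sample text as a browser would submit it from a
    // UTF-8 page: each UTF-8 byte is percent-encoded.
    static final String SUBMITTED = "%D8%A7%D8%A8%D9%88";

    /** Decodes the submitted value using the given charset name. */
    static String decode(String charsetName) {
        try {
            return URLDecoder.decode(SUBMITTED, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("Charset not available: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String wrong = decode("ISO-8859-1"); // every byte becomes its own character
        String right = decode("UTF-8");      // byte pairs become Arabic letters

        System.out.println(wrong.length());                      // 6
        System.out.println(right.length());                      // 3
        System.out.println(right.equals("\u0627\u0628\u0648"));  // true
    }
}
```

With the wrong charset the application already holds a damaged string before it reaches JdbcTemplate, which is why the database settings alone did not fix the problem.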

Try setting the character encoding in the connection string, as explained in the docs, e.g.:

jdbc:mysql://localhost/some_db?useUnicode=yes&characterEncoding=UTF-8

You can also set this in the server configuration; see the docs.

13 Comments

I tried it and I get this error: The reference to entity "characterEncoding" must end with the ';' delimiter.
I set this configuration in my file spring-datasource.xml.
@Souad, you need to escape the & characters (as &amp;) to make the XML valid. By the way, see the description of these options in this doc: dev.mysql.com/doc/refman/5.7/en/…
Now I cannot log in to my application! I have another problem, with Spring Security!
@Souad, perhaps with this encoding your password string looks different... try resetting your password to see if this is the case.
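
When the connection is opened from Java code rather than from an XML file, the two options can be passed as connection properties, which avoids the & escaping issue altogether. This is a minimal sketch under assumptions: the class `Utf8Connection` and its methods are hypothetical, the database name `some_db` is taken from the answer above, and a MySQL JDBC driver is assumed to be on the classpath when `open` is called.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

class Utf8Connection {
    /** Builds connection properties that ask the driver to use UTF-8. */
    static Properties utf8Properties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Same options as in the URL form: ?useUnicode=...&characterEncoding=UTF-8
        props.setProperty("useUnicode", "true");
        props.setProperty("characterEncoding", "UTF-8");
        return props;
    }

    /** Opens a connection; requires a MySQL JDBC driver and a reachable server. */
    static Connection open(String user, String password) throws SQLException {
        return DriverManager.getConnection(
                "jdbc:mysql://localhost/some_db", utf8Properties(user, password));
    }
}
```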
This Answer (belatedly) discusses how to recover the mojibaked text.

Ø§Ø¨Ùˆ Ù†Øµ represents ابو نص. Hex: D8A7D8A8D98820D986D8B5. That's 5 Arabic characters (Dxxx), plus a space (20).

What caused the problem:

  • The bytes you have in the client are correctly encoded in utf8.
  • You connected with latin1, probably by default. (It should have been utf8.)
  • The column in the table was declared CHARACTER SET latin1. (Or possibly it was inherited from the table/database.) (It should have been utf8.)
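
The mechanism described in the list above can be reproduced outside the database. The sketch below is an illustration only, with hypothetical names (`MojibakeDemo`, `garble`, `hex`). It assumes windows-1252 as the Java counterpart of MySQL's latin1, and it assumes that charset is available in the running JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Assumption: windows-1252 stands in for MySQL's "latin1" here.
    static final Charset LATIN1_LIKE = Charset.forName("windows-1252");

    /** Encodes text as UTF-8, then misreads those bytes as latin1-like text. */
    static String garble(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), LATIN1_LIKE);
    }

    /** Returns the UTF-8 bytes of text as an uppercase hex string. */
    static String hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The sample text from the question, written with Unicode escapes.
        String original = "\u0627\u0628\u0648 \u0646\u0635";

        System.out.println(hex(original)); // D8A7D8A8D98820D986D8B5
        // Each Arabic letter turns into two Western characters (11 in total).
        System.out.println(garble(original).length()); // 11
    }
}
```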

The fix for the data is a "2-step ALTER".

ALTER TABLE Tbl MODIFY COLUMN col VARBINARY(...) ...;
ALTER TABLE Tbl MODIFY COLUMN col VARCHAR(...) ... CHARACTER SET utf8 ...;

where the lengths are big enough and the other "..." have whatever else (NOT NULL, etc) was already on the column.
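
The same idea as the two ALTER statements, dropping the wrong character set label and then reading the unchanged bytes as utf8, can be sketched in memory for a single string. This is only an analogy under assumptions, not a replacement for fixing the column: the names `MojibakeRecovery` and `recover` are hypothetical, and windows-1252 is assumed as the counterpart of MySQL's latin1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeRecovery {
    // Assumption: windows-1252 stands in for MySQL's "latin1" here.
    static final Charset LATIN1_LIKE = Charset.forName("windows-1252");

    /** Reverses a UTF-8-read-as-latin1 mix-up for a single string. */
    static String recover(String garbled) {
        // Step 1 (like VARBINARY): get back the raw bytes behind the wrong label.
        byte[] raw = garbled.getBytes(LATIN1_LIKE);
        // Step 2 (like VARCHAR ... CHARACTER SET utf8): read them as UTF-8.
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The garbled sample from the question, written with Unicode escapes.
        String garbled = "\u00D8\u00A7\u00D8\u00A8\u00D9\u02C6 \u00D9\u2020\u00D8\u00B5";
        String fixed = recover(garbled);
        System.out.println(fixed.equals("\u0627\u0628\u0648 \u0646\u0635")); // true
    }
}
```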

By the way, utf8_general_ci is a "collation"; only the "character set", utf8, is relevant to this problem.
