1

I'm having problems trying to insert strings containing UTF-8 encoded Chinese characters and punctuations into a SQL Server 2008 table (default installation) from my Delphi 7 application using Zeosdb native SQL Server library.

I remembered in the past I had problems inserting UTF8 string into SQL Server even using PHP and other methods so I believe that this problem is not unique to Zeosdb.

It doesn't happen all the time, some UTF8 encoded strings can get inserted successfully but some not. I can't figure out what is it in the string that caused the failure.

Table schema:

CREATE TABLE [dbo].[incominglog](
    [number] [varchar](50) NULL,
    [keyword] [varchar](1000) NULL,
    [message] [varchar](1000) NULL,
    [messagepart1] [varchar](1000) NULL,
    [datetime] [varchar](50) NULL,
    [recipient] [varchar](50) NULL
) ON [PRIMARY]

SQL statement template:

INSERT INTO INCOMINGLOG ([Number], [Keyword], [Message], [MessagePart1], [Datetime], [Recipient]) 
VALUES('{N}', '{KEYWORD}', '{M}', '{M1}', '{TIMESTAMP}', '{NAME}')

The parameter {KEYWORD}, {M} and {M1} can contain UTF8 string.

For example, the following statement will return an error:

Incorrect syntax near 'é¢'. Unclosed quotation mark after the character string '全力克æœå››ç§å±é™©','2013-06-19 17:07:28','')'.

INSERT INTO INCOMINGLOG ([Number], [Keyword], [Message], [MessagePart1], [Datetime], [Recipient]) 
VALUES('+6590621005', '题', '题 [全力克æœå››ç§å±é™© åšå†³æ‰«é™¤ä½œé£Žä¹‹å¼Š]', '[全力克æœå››ç§å±é™©','2013-06-19 17:07:28', '')

Note: Please ignore the actual characters as the utf8 encoding is lost after copy and paste.

I've also tried using NVARCHAR instead of VARCHAR:

CREATE TABLE [dbo].[incominglog](
    [number] [varchar](50) NULL,
    [keyword] [nvarchar](max) NULL,
    [message] [nvarchar](max) NULL,
    [messagepart1] [nvarchar](max) NULL,
    [datetime] [varchar](50) NULL,
    [recipient] [varchar](50) NULL
) ON [PRIMARY]

And also tried amending the SQL statement into:

INSERT INTO INCOMINGLOG ([Number],[Keyword],[Message],[MessagePart1],[Datetime],[Recipient]) VALUES('{N}',N'{KEYWORD}',N'{M}',N'{M1}','{TIMESTAMP}','{NAME}')

They don't work either. I would appreciate any pointer. Thanks.

EDITED: As indicated by marc_s below, the N prefix must be outside the single quotes. It is correct in my actual test, the initial statement is a typo, which I've corrected.

The test with the N prefix also returned an error:

Incorrect syntax near '原标é¢'. Unclosed quotation mark after the character string '全力克æœ?å››ç§?å?±é™©','2013-06-19 21:22:08','')'.

The SQL statement:

INSERT INTO INCOMINGLOG ([Number],[Keyword],[Message],[MessagePart1],[Datetime],[Recipient]) VALUES('+6590621005',N'原标题',N'原标题 [全力克�四��险 �决扫除作风之弊]',N'[全力克�四��险','2013-06-19','')

. .

REPLY TO gbn's Answer: I've tried using parameterized SQL but still encountering "Unclosed quotation mark after the character string" error.

For the new test, I used a simplified SQL statement:

INSERT INTO INCOMINGLOG ([Keyword],[Message]) VALUES(:KEYWORD,:M)

The error returned for the above statement:

Incorrect syntax near '原标é¢'. Unclosed quotation mark after the character string '')'.

For info, the values of KEYWORD and M are:

KEYWORD:原标题

M:原标题 [

. . .

Further tests on 20th June Parametarized SQL query don't work so I tried a different approach by trying to isolate the character that caused the error. After trial and error, I managed to identify the problematic character.

The following character produces an error: 题

SQL Statement: INSERT INTO INCOMINGLOG ([Keyword]) VALUES('题')

Interestingly, note that the string in the return error tax contains a "?" character which didn't exist in the original statement.

Error: Unclosed quotation mark after the character string '�)'. Incorrect syntax near '�)'.

If I were to place some latin characters immediately after the culprit character, there will be no error. For example, INSERT INTO INCOMINGLOG ([Keyword]) VALUES('题Ok') works ok. Note: It doesn't work with all characters.

5
  • Does the UTF-8 string which you use as replacement for {KEYWORD} contain a quotation mark? The error message seems to indicate this. I recommend to use a parametrized query where such errors can not happen. Commented Jun 19, 2013 at 10:53
  • 2
    Parametrized queries will also prevent SQL injection. Commented Jun 19, 2013 at 11:24
  • hi mjn, I've doubled up all quotes. I suspect that quotes might not be the only illegal characters in UTF8 string. Commented Jun 19, 2013 at 11:44
  • Your SQL INSERT INTO INCOMINGLOG ([Keyword]) VALUES('题') works for me in SSMS Commented Jun 20, 2013 at 6:39
  • @gbn I'm not sure but I think it's not the same when you copy to SSMS, perhaps the encoding changes. The only way to test is using code - I've created PHP code that you can download from here. There are 2 PHP files, one with the statement that works and another which fails. By the way, this article says that MSSQL doesn't support UTF-8 - article link. Previously, before using UTF-8, I converted the UTF-8 to widestring, but they appear as question marks when viewed from SSMS. Commented Jun 20, 2013 at 15:06

1 Answer 1

3

There are ' characters in the UTF-8 which abnormally terminate the SQL.

Classic SQL injection.

Use proper parametrisation, not string concatenation basically.

Edit, after Question updates...

Without the Delphi code, I don't think we can help you
All SQL side code works. For example, this works in SSMS

DECLARE @t TABLE ([Keyword] nvarchar(100) COLLATE Chinese_PRC_CI_AS);
INSERT INTO @t ([Keyword]) VALUES('题');
INSERT INTO @t ([Keyword]) VALUES(N'题');
SELECT * FROM @t T;

Something is missing to help us fic this

Also see

Sign up to request clarification or add additional context in comments.

4 Comments

Thanks gbn, I've tried using parameterized SQL, but still the same error, please check my comment in the original question.
I like point out that my SQL Server db is using Chinese_PRC_CI_AS collation. I've no issues inserting the same data in GBK encoding manually using SQL management studio - but my software inserts using UTF8.
@Joshua: Can you add the Delphi code please. This could be a client error, not a SQL error as such
I've created another question for this, from where there's a link which you can download a test program I made - click here. To reproduce the problem, uncomment "//inserted as utf8" and remove the "ok" in the editbox.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.