7

I'm writing a PHP script to import data into a MySQL database from a Microsoft SQL Server 2008 database.

The MSSQL Server is set with a collation of "SQL_Latin1_General_CP1_CI_AS" and the data in question is being stored in a column of the type "nchar".

My PHP web pages use

<meta http-equiv="content-type" content="text/html; charset=utf-8">

to indicate that they should be displayed with UTF-8 Character encoding.

I'm pulling the data from the MSSQL database using the sqlsrv PHP extension.

$sql = 'SELECT * FROM [tArticle] WHERE [ID] = 6429';
$stmt = sqlsrv_query($dbHandler, $sql);

while ($row = sqlsrv_fetch_object($stmt)) {
  // examples of what I've tried simply to display the data
  echo $row->Text1;
  echo utf8_encode($row->Text1);
  echo iconv("ISO-8859-1", "UTF-8", $row->Text1);
  echo iconv("ISO-8859-1", "UTF-8//TRANSLIT", $row->Text1);
}

Forget about inserting the data into the MySQL database for now. I can't get the string to display properly in my PHP page. From the examples in my listing:

echo $row->Text1

is rendered by my browser as an obviously invalid character: "Lucy�s"

All of the examples after that one render the character as a blank: "Lucys"

It looks like a character set mismatch to me, but how can I get this data from the MS SQL database to display properly (without changing my web-page encoding)? If I can figure that out, I can probably work out how to store it in the MySQL database.

1
  • I haven't worked with sqlsrv but you may need to set its connection encoding upon connecting to the database. The equivalent of running SET NAMES utf8 under mysql after connecting. Commented Jan 11, 2011 at 23:25

2 Answers

14

If the strings in the source database are encoded in UTF-8, you should use utf8_decode, not utf8_encode.

But they're probably encoded in some Latin or "Western" Windows code page. So I would try, for example, iconv("CP1252", "UTF-8", $row->Text1).
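To see why the ISO-8859-1 attempts printed a blank while CP1252 works, here is a minimal sketch. It assumes the troublesome character is a curly apostrophe arriving as the single CP1252 byte 0x92; the sample string is hypothetical and not taken from the actual database.

```php
<?php
// Hypothetical sample: "Lucy's" with a curly apostrophe as the CP1252 byte 0x92.
$raw = "Lucy\x92s";

// Read as ISO-8859-1, byte 0x92 becomes U+0092, an invisible C1 control character.
$asLatin1 = iconv('ISO-8859-1', 'UTF-8', $raw);

// Read as CP1252, byte 0x92 becomes U+2019, the right single quotation mark.
$asCp1252 = iconv('CP1252', 'UTF-8', $raw);

echo bin2hex($asLatin1), "\n"; // 4c756379c29273
echo bin2hex($asCp1252), "\n"; // 4c756379e2809973
```

ISO-8859-1 and CP1252 agree on most bytes but differ in the range 0x80 to 0x9F, which is where Windows places curly quotes and dashes. That would explain both the invalid character in the raw output and the blank after the ISO-8859-1 conversions.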

Another alternative is to run a SQL query that explicitly sets a known encoding. For example, according to the Windows Collation Name (Transact-SQL) documentation, this query would use code page 1252 to encode field Text1: SELECT Text1 COLLATE SQL_Latin1_General_CP1_CI_AS FROM ....


1 Comment

iconv with "CP1252" did the trick, which is odd to me since the MS documentation on the "nchar" type says it is a Unicode type encoded as UCS-2. Thanks for the fix!

7

Try this connection option; it works for me:

$connectionInfo = array( "Database"=>"DBName", "CharacterSet" =>"UTF-8"); 
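For context, a rough sketch of how that array might be passed to the driver. The server name 'localhost', the database name, and the use of Windows authentication are placeholders, not details from the question. As far as I understand it, without the "CharacterSet" option the driver returns strings in the Windows system code page, which would explain why CP1252 worked even though "nchar" is stored as Unicode.

```php
<?php
// Placeholder connection details (assumptions, adjust for your environment).
$serverName = 'localhost';
$connectionInfo = array(
    'Database'     => 'DBName',
    'CharacterSet' => 'UTF-8', // ask the driver to return strings as UTF-8
);

if (!function_exists('sqlsrv_connect')) {
    echo "The sqlsrv extension is not loaded.\n";
} else {
    $dbHandler = sqlsrv_connect($serverName, $connectionInfo);
    if ($dbHandler === false) {
        // sqlsrv_errors() returns details about the failed connection.
        print_r(sqlsrv_errors());
    } else {
        $stmt = sqlsrv_query($dbHandler, 'SELECT [Text1] FROM [tArticle] WHERE [ID] = 6429');
        while ($stmt !== false && ($row = sqlsrv_fetch_object($stmt))) {
            echo $row->Text1, "\n"; // already UTF-8, so no iconv() call is needed
        }
    }
}
```

With this option set, the value should be usable directly on a UTF-8 page; applying iconv() or utf8_encode() on top of it would encode the text twice.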
