I am using PostgreSQL to power a C# desktop application. When I use the pgAdmin query tool to update a text column with a special character (such as the copyright or trademark symbol), it works perfectly:
update table1 set column1='value with special character ©' where column2=1
When I use this same query from my C# application, it throws an error:
invalid byte sequence for encoding
After researching this issue, I understand that .NET strings use the UTF-16 Unicode encoding.
Consider:
string sourcetext = "value with special character ©";
// Convert a string to utf-8 bytes.
byte[] utf8Bytes = System.Text.Encoding.UTF8.GetBytes(sourcetext);
// Convert utf-8 bytes to a string.
string desttext = System.Text.Encoding.UTF8.GetString(utf8Bytes);
The problem here is that both sourcetext and desttext are UTF-16 strings, so the round trip through UTF-8 bytes changes nothing. When I pass desttext, I still get the exception.
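To make this concrete, here is a minimal sketch of the round trip. The class and method names (Utf8RoundTripDemo, Utf8Hex, RoundTripIsIdentity) are made up for illustration only. It dumps the UTF-8 bytes of the copyright sign and checks that decoding those bytes gives back a string equal to the original, which is why passing desttext instead of sourcetext cannot make a difference:

```csharp
using System;
using System.Text;

// Illustrative only: these names are not part of my application.
public static class Utf8RoundTripDemo
{
    // Returns the UTF-8 bytes of the text as hyphen-separated hex.
    public static string Utf8Hex(string text)
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(text));
    }

    // True when encoding to UTF-8 and decoding again yields an equal string.
    public static bool RoundTripIsIdentity(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.UTF8.GetString(utf8Bytes) == text;
    }

    public static void Main()
    {
        // "\u00A9" is the copyright sign, written as an escape so the
        // result does not depend on how the source file is saved.
        Console.WriteLine(Utf8Hex("\u00A9"));                                  // C2-A9
        Console.WriteLine(RoundTripIsIdentity("value with special character \u00A9")); // True
    }
}
```

So the byte array holds valid UTF-8 (C2 A9 for the copyright sign), but as soon as it is turned back into a string it is an ordinary .NET string again.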
I've also tried the following without success:
Encoding.GetString, BitConverter.ToString
Edit: I even tried this, and it doesn't help:
unsafe
{
    string utfeightstring = null;
    string sourcetext = "value with special character ©";
    Console.WriteLine(sourcetext);
    // Convert a string to utf-8 bytes, reinterpreted as sbyte[] for the String constructor.
    sbyte[] utf8Chars = (sbyte[]) (Array) System.Text.Encoding.UTF8.GetBytes(sourcetext);
    System.Text.UTF8Encoding encoding = new System.Text.UTF8Encoding(true, true);
    // Instruct the Garbage Collector not to move the memory
    fixed (sbyte* pUtf8Chars = utf8Chars)
    {
        // Decode the pinned utf-8 bytes back into a .NET string.
        utfeightstring = new String(pUtf8Chars, 0, utf8Chars.Length, encoding);
    }
    Console.WriteLine("The UTF8 String is " + utfeightstring);
}
Is there a data type in .NET that supports storing a UTF-8 encoded string? Are there alternative ways to handle this situation?