PostgreSQL does not accept \0 in a UTF-8 string, while C# does

Question

I have some data that includes a \0 byte, and it appears to be valid UTF-8:

using System;
using System.Text;
                    
public class Program
{
    public static void Main()
    {
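        // "AB" followed by a NUL byte (0x00)
        byte[] b = new byte[3];
        b[0] = 65;
        b[1] = 66;
        b[2] = 0;

        // Decodes without error; the resulting string contains U+0000
        Console.WriteLine(Encoding.UTF8.GetString(b));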
    }
}

That code works fine. But, when trying to update a record in Postgres, it complains about it:

22021: invalid byte sequence for encoding "UTF8": 0x00

The data shouldn't be there, but how can it be that one system accepts it, and another doesn't? I reckon they both implement standards.

Lukasz Szozda · Accepted Answer · 2021-02-24 14:41:29Z

From the documentation, 8.3. Character Types:

+-----------------------------------+----------------------------+
|               Name                |        Description         |
+-----------------------------------+----------------------------+
| character varying(n), varchar(n)  | variable-length with limit |
| character(n), char(n)             | fixed-length, blank padded |
| text                              | variable unlimited length  |
+-----------------------------------+----------------------------+
The characters that can be stored in any of these data types are determined by the database character set, which is selected when the database is created. Regardless of the specific character set, the character with code zero (sometimes called NUL) cannot be stored. For more information refer to Section 23.3.
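
Since the data shouldn't contain the NUL in the first place, one option is to strip it on the C# side before the value is sent to PostgreSQL. The following is only a minimal sketch: `NulSanitizer` and `StripNul` are hypothetical names introduced for illustration, not part of .NET or any PostgreSQL driver, and whether silently dropping the character is acceptable depends on your data.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of .NET or any PostgreSQL driver.
public static class NulSanitizer
{
    // Removes every U+0000 character, which PostgreSQL character types cannot store.
    public static string StripNul(string value)
    {
        // String.Replace(string, string) performs an ordinal comparison.
        return value.Replace("\0", string.Empty);
    }
}

public class Program
{
    public static void Main()
    {
        byte[] b = { 65, 66, 0 };                      // "AB" plus a NUL byte
        string decoded = Encoding.UTF8.GetString(b);    // valid UTF-8, contains U+0000
        string cleaned = NulSanitizer.StripNul(decoded);

        Console.WriteLine(decoded.Length); // 3
        Console.WriteLine(cleaned.Length); // 2
    }
}
```

If the NUL is meaningful and must be preserved, a binary column type such as bytea may be a better fit than a character type.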

2 Comments

Bart Friederichs Over a year ago

Thanks, the resulting message is a bit confusing though. It looks like the code point is invalid, not that it is not allowed.

Richard Huxton Over a year ago

Also for some background read this: commandprompt.com/blog/…
