I'm configuring a PostgreSQL database for Bitbucket Server on Windows. The official guide says:

The database must be configured to use the UTF-8 character set.

It doesn't strictly say that you have to set the collation to UTF-8, but this is recommended for other Atlassian products, so I assume the same applies to Bitbucket Server. Example from the Confluence documentation:

  • Character encoding must be set to utf8 encoding.
  • Collation must also be set to utf8. Other collations, such as "C", are known to cause issues with Confluence.

This is what I have now. The problem is that it sets the collation to English_United States.1252:

CREATE DATABASE test
WITH OWNER "postgres"
ENCODING 'UTF8'
LC_COLLATE = 'american_usa'
LC_CTYPE = 'american_usa'
TEMPLATE template0;

Is setting the collation to UTF-8 actually necessary and, if so, how can I do it?

2 Answers

Assuming you want to create a PostgreSQL database on Windows with UTF-8 encoding and US-locale sort order and character classification, the following modification of the code in the question should achieve that:

CREATE DATABASE "example_db"
WITH OWNER "postgres"
ENCODING 'UTF8'
LC_COLLATE = 'en-US'
LC_CTYPE = 'en-US'
TEMPLATE template0;

One-liner for copying and pasting into a terminal:

CREATE DATABASE "example_db" WITH OWNER "postgres" ENCODING 'UTF8' LC_COLLATE = 'en-US' LC_CTYPE = 'en-US' TEMPLATE template0;

For anyone trying to create a similar database in a Linux environment such as Ubuntu on Windows Subsystem for Linux, you can do the following (depending on the specific environment, you may need to use 'en_US.UTF8' as the locale instead):

CREATE DATABASE "example_db"
WITH OWNER "postgres"
ENCODING 'UTF8'
LC_COLLATE = 'en_US.UTF-8'
LC_CTYPE = 'en_US.UTF-8'
TEMPLATE template0;

One-liner for copying and pasting into a terminal:

CREATE DATABASE "example_db" WITH OWNER "postgres" ENCODING 'UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8' TEMPLATE template0;
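
To check what a database actually ended up with, a query along these lines may help. This is only a sketch: it assumes the database name example_db used above, and it reads the pg_database system catalog, decoding the numeric encoding with pg_encoding_to_char.

```sql
-- Show the encoding, collation, and character classification of one database.
-- 'example_db' is the name used in this answer; substitute your own.
SELECT datname,
       pg_encoding_to_char(encoding) AS encoding,
       datcollate,
       datctype
FROM pg_database
WHERE datname = 'example_db';
```

The encoding column should read UTF8, while datcollate and datctype show whichever locale name the platform accepted.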

3 Comments

en_US.UTF-8 does not exist as a locale on Windows, so using it results in an error.
You are right @spankmaster79, see edit for creating the db on Windows without Windows Subsystem for Linux.
For Turkish locale: CREATE DATABASE "example_db" WITH OWNER=postgres ENCODING 'UTF-8' LC_COLLATE 'tr_TR.UTF-8' LC_CTYPE 'tr_TR.UTF-8' TEMPLATE template0;
12

There is no UTF8 collation. UTF8 is a way to encode characters as numbers, a so-called encoding. Collations define how characters (and composites) are ordered.
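
To illustrate the difference, here is a small sketch that sorts the same three values twice, once under the built-in "C" collation and once under the database's default collation. The values and the column name are made up for the example. The encoding is the same in both queries, and only the ordering rule changes.

```sql
-- Byte-wise ordering: under "C", uppercase 'B' sorts before lowercase 'a'.
SELECT name
FROM (VALUES ('a'), ('B'), ('c')) AS t(name)
ORDER BY name COLLATE "C";

-- Ordering under the database's default collation (depends on LC_COLLATE).
SELECT name
FROM (VALUES ('a'), ('B'), ('c')) AS t(name)
ORDER BY name;
```

With a language-aware collation such as a US English locale, the second query would typically return a, B, c, whereas the first returns B, a, c.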

While you have to pick a collation that matches the database encoding with PostgreSQL on UNIX, that is not required on Windows. Maybe the documentation you are reading is targeted at UNIX.

You should ask the people who wrote the software to tell you what collation to use.
