62

In several SO posts, the OP asked for an efficient way to search text columns in a case-insensitive manner.

As far as I understand, the most efficient way is to have a database with a case-insensitive collation. In my case I am creating the database from scratch, so I have full control over the DB collation. The only problem is that I have no idea how to define it, and I could not find any example of it.

Please show me how to create a database with a case-insensitive collation.

I am using PostgreSQL 9.2.4.

EDIT 1

The CITEXT extension is a good solution. However, it has some limitations, as explained in the documentation. I will certainly use it if no better way exists.

I would like to emphasize that I want ALL string operations to be case insensitive. Using CITEXT for every TEXT field is one way. However, using a case-insensitive collation would be best, if at all possible.

Now https://stackoverflow.com/users/562459/mike-sherrill-catcall says that PostgreSQL uses whatever collations the underlying system exposes. I do not mind making the OS expose a case-insensitive collation. The only problem is that I have no idea how to do it.

5
  • 3
    PostgreSQL uses whatever collations the underlying operating system exposes. The system table "pg_collation" is populated by initdb. Use select * from pg_collation; to see which collations it found. Commented Sep 15, 2013 at 2:57
  • This does not answer my question. Commented Sep 15, 2013 at 3:12
  • 2
    You can try to use this Postgres extension Commented Sep 15, 2013 at 8:00
  • 1
    @mark: That's why I posted it as a comment, not as an answer. If you run that query, and you find no case-insensitive collations, that's probably your answer. Commented Sep 15, 2013 at 14:03
  • Possible duplicate of stackoverflow.com/q/17422054/157957 and stackoverflow.com/q/1929590/157957 Commented Sep 15, 2013 at 22:14
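
A minimal sketch of the check suggested in the comments above: list the collations that initdb picked up from the operating system. The column selection is an assumption based on the pg_collation catalog as it looks in 9.x and later; adjust it if your version differs.

```sql
-- List every collation known to this cluster, with the
-- OS locale names used for sorting (collcollate) and
-- character classification (collctype).
SELECT collname, collcollate, collctype
FROM pg_collation
ORDER BY collname;
```

If nothing in the output is a case-insensitive collation, the cluster has none to offer on that version.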

5 Answers

44

A lot has changed since this question was asked. Native support for case-insensitive collations was added in PostgreSQL v12. For many uses, this makes the citext extension mentioned in the other answers unnecessary.

In PostgreSQL v12, one can do:

    CREATE COLLATION case_insensitive (
      provider = icu,
      locale = 'und-u-ks-level2',
      deterministic = false
    );

    CREATE TABLE names(
      first_name text,
      last_name text
    );

    INSERT INTO names VALUES
      ('Anton','Egger'),
      ('Berta','egger'),
      ('Conrad','Egger');

    -- Sort by last name, then first name, ignoring case.
    SELECT * FROM names
      ORDER BY
        last_name COLLATE case_insensitive,
        first_name COLLATE case_insensitive;

See https://www.postgresql.org/docs/current/collation.html for more information.
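
The same collation can also be attached to a column, so that equality tests and unique constraints ignore case without a COLLATE clause in every query. A minimal sketch, assuming the case_insensitive collation created above already exists; the table people and the column email are hypothetical names used only for illustration:

```sql
-- Hypothetical table; requires the case_insensitive collation
-- defined earlier (PostgreSQL 12+ built with ICU support).
CREATE TABLE people (
  email text COLLATE case_insensitive UNIQUE
);

INSERT INTO people VALUES ('Alice@Example.com');

-- Finds the row above even though the case differs.
SELECT * FROM people WHERE email = 'alice@example.com';

-- Expected to be rejected by the unique constraint,
-- because the two values compare as equal:
-- INSERT INTO people VALUES ('ALICE@EXAMPLE.COM');
```

This covers equality and ordering only, not pattern matching.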


10 Comments

Note that this depends on the operating system and the ICU version that comes with it.
Note that as of PostgreSQL v12, nondeterministic collations DO NOT support LIKE and ILIKE.
What does "und-u-ks-level2" mean?
@user275801 See the Unicode Collation Settings table for more info on this syntax.
Overview of ICU collation settings by Peter Eisentraut perfectly explains what all the parts of und-u-ks-level2 mean.
10

For my purpose the ILIKE keyword did the job.

From the postgres docs:

The key word ILIKE can be used instead of LIKE to make the match case-insensitive according to the active locale. This is not in the SQL standard but is a PostgreSQL extension.
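
A short sketch of how this might look in practice, assuming a hypothetical table accounts with a text column name. Because % and _ are wildcards in the pattern, they have to be escaped when ILIKE is used as a stand-in for a case-insensitive equality test; backslash is the default escape character, and the literals below assume standard_conforming_strings is on (the default).

```sql
-- Pattern match: finds 'Egger', 'egger', 'EGGERT', ...
SELECT * FROM accounts WHERE name ILIKE 'egg%';

-- Equality-style match on a value that itself contains wildcards:
-- \% and \_ are taken literally, so this matches '100%_off'
-- and '100%_OFF', but not '100xyoff'.
SELECT * FROM accounts WHERE name ILIKE '100\%\_off';

-- Escaping an arbitrary search string in SQL. The backslash must be
-- escaped first, then the two wildcard characters.
SELECT * FROM accounts
WHERE name ILIKE
  replace(replace(replace('50%_a\b', '\', '\\'), '%', '\%'), '_', '\_');
```

In application code the search string would normally be passed as a bound parameter in place of the literal '50%_a\b'.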

2 Comments

Unless you escape the pattern, this will produce wrong results for strings containing _ or % if you attempt to use it like =.
The problem is that frameworks like TypeORM have no support for ILIKE in their find conditions.
10

There are no case insensitive collations, but there is the citext extension:

http://www.postgresql.org/docs/current/static/citext.html
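
A minimal sketch of using citext, assuming the contrib modules are installed on the server; the table members and column nick are hypothetical names:

```sql
-- Enable the extension in the current database (PostgreSQL 9.1+).
CREATE EXTENSION IF NOT EXISTS citext;

-- A citext column compares case-insensitively, including in
-- primary key and unique constraints.
CREATE TABLE members (
  nick citext PRIMARY KEY
);

INSERT INTO members VALUES ('Larry');

-- Finds the row above despite the different case.
SELECT * FROM members WHERE nick = 'LARRY';
```

The value is stored as entered; only comparisons ignore case.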

1 Comment

See my other answer (stackoverflow.com/a/59101567/8870331) for PostgreSQL v12 and beyond.
2

This does not change the collation, but this type of query, where I used the lower function, may help somebody:

    SELECT id, full_name, email
    FROM nurses
    WHERE (lower(full_name) LIKE '%bar%' OR lower(email) LIKE '%bar%')
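
Queries on lower(...) cannot use a plain index on the column itself. As a hedged sketch, not part of the original answer, an expression index can support equality tests on lower(email), and a trigram index from the pg_trgm contrib extension can support the '%bar%' style of search. The index names are hypothetical, and pg_trgm is assumed to be installed on the server.

```sql
-- Expression index: helps queries such as
--   WHERE lower(email) = 'someone@example.com'
CREATE INDEX nurses_lower_email_idx ON nurses (lower(email));

-- Trigram index: can help LIKE patterns with a leading wildcard,
-- such as lower(full_name) LIKE '%bar%'.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX nurses_lower_full_name_trgm_idx
  ON nurses USING gin (lower(full_name) gin_trgm_ops);
```

Whether the planner actually uses either index depends on table size and statistics, so it is worth checking with EXPLAIN.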


-4

I believe you need to specify your collation as a command line option to initdb when you create the database cluster. Something like

initdb --lc-collate=en_US.UTF-8 

It also seems that using PostgreSQL 9.3 on Ubuntu and Mac OS X, initdb automatically creates the database cluster using a case-insensitive collation that is default in the current OS locale, in my case, en_US.UTF-8.

Could you be using an older version of PostgreSQL that does not default to the host locale? Or could it be that you are on an operating system that does not provide any case-insensitive collations for PostgreSQL to choose from?

4 Comments

I am now using PostgreSQL 9.3 on Windows 7 and 8. I have no idea whether they provide a case-insensitive collation for PostgreSQL. I know that SQL Server can be configured with such a collation.
I can't help with Windows... but it sounds like that's the place to start. Find out what case-insensitive collations Windows provides, and see if you can tell PostgreSQL to use one of them when it creates the cluster.
From my experiments with PostgreSQL 9.6, the --lc-collate=en_US.UTF-8 option does not produce a case-insensitive collation.
This answer is wrong. The collation is not case insensitive.
