I have a table called "doctors" and a field called "fullname" which will store names with accents. What I need to do is an "accent insensitive + case insensitive" search, something like:

SELECT * 
FROM doctors
WHERE unaccent_t(fullname) ~* 'unaccented_and_lowercase_string';

where the value to search will come unaccented+lowercase and unaccent_t is a function defined as:

CREATE FUNCTION unaccent_t(text, lowercase boolean DEFAULT false)
RETURNS text AS
$BODY$
SELECT CASE
  WHEN $2 THEN unaccent('unaccent', lower(trim($1)))
  ELSE unaccent('unaccent', trim($1))
END;
$BODY$ LANGUAGE sql IMMUTABLE SET search_path = public, pg_temp;

(I already installed 'unaccent' extension).

So, I went ahead and created the index for "fullname" field:

CREATE INDEX doctors_fullname ON doctors (unaccent_t(fullname) text_pattern_ops);

(I also tried varchar_pattern_ops, and also without specifying an operator class at all.)

In the doctors table, I have around 15K rows.

The query works and I get the expected results, but when I run it with EXPLAIN ANALYZE, I don't see the index being used:

Seq Scan on doctors  (cost=0.00..4201.76 rows=5 width=395) (actual time=0.282..182.025 rows=15000 loops=1)
  Filter: (unaccent_t((fullname)::text, false) ~* 'garcia'::text)
  Rows Removed by Filter: 1
Planning time: 0.207 ms
Execution time: 183.387 ms

I also tried removing the optional parameter from unaccent_t but I got the same results.

In a scenario like this, how should I define the index so it gets used in a query like the one above?

  • This technique can work with = or a left-anchored LIKE, but ~* is an operator for matching a regular expression. Commented Oct 9, 2015 at 19:41
  • So just try a left-anchored pattern such as WHERE name LIKE 'garcia%'; that should be able to use the index. A pattern with a leading wildcard, such as WHERE name LIKE '%garcia', won't. Commented Oct 9, 2015 at 20:11

1 Answer

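B-tree indexes can speed up pattern matching only when the pattern is a constant anchored to the start of the string (for example LIKE 'foo%' or ~ '^foo'). An unanchored regular expression such as ~* 'garcia' cannot use one.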
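As a rough illustration, a left-anchored search that the question's existing expression index could serve might look like the sketch below. The search term is only an example value, and whether the planner actually chooses the index still depends on the table's statistics.

```sql
-- Relies on the expression index already created in the question:
--   CREATE INDEX doctors_fullname ON doctors (unaccent_t(fullname) text_pattern_ops);
-- The pattern is a constant anchored at the start of the string,
-- so a B-tree index scan is possible.
-- Note: this index expression does not lowercase the name, so the match
-- is case-sensitive; indexing unaccent_t(fullname, true) instead and
-- querying the same expression would make it case-insensitive.
SELECT *
FROM doctors
WHERE unaccent_t(fullname) LIKE 'Garcia%';
```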

Starting from PostgreSQL 9.3 you can speed up generic regular expression searches using a GIN or GiST index with the operator classes provided by the pg_trgm contrib module.

You can read more about it in the PostgreSQL manual at http://www.postgresql.org/docs/9.4/static/pgtrgm.html#AEN163078
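A minimal sketch of such a trigram index for the table in the question, assuming the pg_trgm extension is available on the server. The index name doctors_fullname_trgm is hypothetical; the index expression has to match the expression used in the WHERE clause.

```sql
-- pg_trgm is a contrib module and must be enabled once per database.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- GIN index over the same expression the query filters on,
-- using the trigram operator class (index name is hypothetical).
CREATE INDEX doctors_fullname_trgm
    ON doctors
    USING gin (unaccent_t(fullname) gin_trgm_ops);

-- Check whether the planner now uses the index for the regex search.
EXPLAIN ANALYZE
SELECT *
FROM doctors
WHERE unaccent_t(fullname) ~* 'garcia';
```

A GiST index with gist_trgm_ops is the alternative mentioned above; which of the two performs better depends on the data and the update pattern.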

5 Comments

Thanks, this sounds good. I'll create then a gin/gist index for this type of search. Cheers.
Trying to create a GIN index with _text_ops tells me that this operator class does not accept text data. Without specifying any operator class, it tells me that there is no default operator class. Specifying gin_trgm_ops creates the index and it works, but I won't be doing 'trigram' full-text search on this field. Is there another operator class I can use for this GIN index? Or is using gin_trgm_ops fine even if I won't perform any full-text search?
Also, EXPLAIN ANALYZE now tells me that the index is being used, but the execution time is basically the same as without the index. It is as if the index is not improving the execution time of the query. What might be wrong?
The GIN or GiST _trgm_ops indexes can be used to speed up any regexp or LIKE search. If the resulting execution time is similar to the one without the index, it could be because your data set is small or because you are selecting a big part of your data anyway.
mnencia, that makes perfect sense. My test data falls into what you've mentioned. The good thing is that now I know that the index is being used. So, in my query scenario this gin index will do the work. Thanks again.
