I'm looking for the fastest way to search a large table (80+ million records) in PostgreSQL (version 9.4) over multiple columns.

I would like to try and use standard PostgreSQL, and not Solr etc.

I'm currently testing Full Text Search, following https://blog.lateral.io/2015/05/full-text-search-in-milliseconds-with-postgresql/.

It works, but I would like a more flexible way to search.

Currently, if I have one column containing e.g. "Volvo" and another containing "Blue", I can find the record with the search string "volvo blue". I would also like to find it with "volvo blu", as if I had used LIKE with '%blu%'.

Is that possible with full text search?

  • FTS has prefix matching facilities, but in general it is not designed to do that efficiently. FTS is designed around finding lexeme matches (blu vs. blue is not a match, but e.g. volvo, volvos and volvo's are). If you can upgrade to 9.6, pg_trgm has a nice new feature: word similarity, which might handle your use cases. Commented May 31, 2017 at 12:44
  • 9.6 also added support for "phrase search" (multiple adjacent words) in FTS. Commented May 31, 2017 at 12:45
  • Or, as an alternative solution, you could do the search in 2 steps: first, search for typos of each word (pg_trgm is especially good at this). Once you have found matches, you can offer your end users the option to search for those instead in a second step (similar to how Google handles misspelled words). Commented May 31, 2017 at 12:47

1 Answer

The only option for something like this is the pg_trgm contrib module.

This enables you to create a GIN or GiST index that indexes all sequences of three characters, which can be used for a search with the similarity operator %.
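As a rough sketch of that setup, assuming a hypothetical table cars with a text column model (neither name comes from the question), the extension and a trigram GIN index could be created like this:

```sql
-- pg_trgm ships with PostgreSQL as a contrib module; enable it once per database.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical table and column names, for illustration only.
CREATE INDEX cars_model_trgm_idx ON cars USING gin (model gin_trgm_ops);

-- Similarity search with the % operator; the trigram index can be used here.
SELECT * FROM cars WHERE model % 'volvo';

-- The same index can also support unanchored LIKE / ILIKE patterns.
SELECT * FROM cars WHERE model ILIKE '%volv%';
```

Whether the planner actually picks the index depends on the data and the search string, so it is worth checking with EXPLAIN on the real table.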

Two notes:

  1. Using the % operator may return “false positive” results, so be sure to add a second condition (e.g. with LIKE) that eliminates those.

  2. A trigram search works well with longer search strings, but performs badly with short search strings because of the many false positive results.
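To illustrate the first note, here is a minimal sketch that pairs % with a second condition and adjusts the similarity threshold. The table cars and column model are assumptions for the example; show_limit() and set_limit() are the pg_trgm functions that control the threshold on 9.4.

```sql
-- Show the similarity threshold currently used by the % operator (default 0.3).
SELECT show_limit();

-- Raise the threshold for this session so that looser matches are dropped.
SELECT set_limit(0.5);

SELECT model, similarity(model, 'volvo') AS sml
FROM cars                        -- hypothetical table
WHERE model % 'volvo'            -- trigram-based candidate filter
  AND model ILIKE '%volvo%'      -- second condition removes unwanted matches
ORDER BY sml DESC;
```

A higher threshold reduces unwanted matches but also hides near misses such as typos, so the right value is something to tune against real search strings.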

If that is not good enough for your purposes, you'll have to resort to a third-party solution.


2 Comments

Their examples primarily show searching for only one word in one column. How can I search for multiple words in multiple columns?
You can either use a single % operator on the concatenated columns ((col1 || ' ' || col2) % 'searchstring'; the parentheses are needed because % binds more tightly than ||) or use several % comparisons joined with AND (col1 % 'searchstring' AND col2 % 'searchstring').
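A sketch of the concatenation variant, assuming a hypothetical table cars with text columns make and color (names invented for the example), might use an expression index so that the combined string is itself indexed:

```sql
-- Expression index over both columns; coalesce guards against NULLs,
-- because concatenating NULL with text yields NULL.
CREATE INDEX cars_make_color_trgm_idx ON cars
    USING gin ((coalesce(make, '') || ' ' || coalesce(color, '')) gin_trgm_ops);

-- The query has to repeat the indexed expression for the index to be considered.
SELECT *
FROM cars
WHERE (coalesce(make, '') || ' ' || coalesce(color, '')) % 'volvo blu';

-- Alternative: one ILIKE per search word against the same expression,
-- closer to the '%blu%' behaviour described in the question.
SELECT *
FROM cars
WHERE (coalesce(make, '') || ' ' || coalesce(color, '')) ILIKE '%volvo%'
  AND (coalesce(make, '') || ' ' || coalesce(color, '')) ILIKE '%blu%';
```

Splitting the search string into words and building the ILIKE conditions would happen in the application; that part is not shown here.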
