2

If I have a table test with values like :

id | value
----------------
1  | ABC 1-2-3
2  | AB 1-2-3-4-5
3  | ABC 1
4  | ABC 1-2
5  | ABC

and the input string I'm trying to is ABC 1-2-3-4-5, then the closest substring match (if I could call it that) should be ABC 1-2-3. Row # 2 should not match because it doesn't have the "ABC". I've only been able to search for the string if the input string is shorter than the actual records, but not if it's longer. e.g

select * from test where value ilike 'ABC 1-2%';

but this also does not give me an exact record, but only those starting with ABC 1-2. How do I construct the proper sql statement to solve this?

2 Answers 2

7

You may be interested in pg_trgm extension:

create extension if not exists pg_trgm;

Standard similarities for your data are as follows:

select *, similarity(value, 'ABC 1-2-3-4-5')
from test
order by 3 desc;

 id |    value     | similarity 
----+--------------+------------
  2 | AB 1-2-3-4-5 |        0.8
  1 | ABC 1-2-3    |   0.714286
  4 | ABC 1-2      |   0.571429
  3 | ABC 1        |   0.428571
  5 | ABC          |   0.285714
(5 rows)

However you can always add additional criteria in WHERE clause:

select *, similarity(value, 'ABC 1-2-3-4-5')
from test
where value ilike 'abc%'
order by 3 desc;

 id |   value   | similarity 
----+-----------+------------
  1 | ABC 1-2-3 |   0.714286
  4 | ABC 1-2   |   0.571429
  3 | ABC 1     |   0.428571
  5 | ABC       |   0.285714
(4 rows)    
Sign up to request clarification or add additional context in comments.

1 Comment

I upvoted this because there's an actual measure of similarity and i can probably use this in the future.
3

Reverse the comparison:

select * from test
where 'ABC 1-2-3-4-5' ilike value || '%'
order by length(value) desc

The best (ie longest) matches will be returned first.

4 Comments

It works like magic. How? Will it work similarly on MySQL/Oracle/MSSQL as well?
@Vikas the "magic" is reversing the usual condition of where my_column like some_value to where some_value like my_column%, which in words means "where some_value starts with my_column". Yes, this will work for all databases (but change ilike to like - ilike is Postgres only).
Have you written more on this topic? I would like learn more to create a solution that can stand the test of time. (various combinations of conditions including multiple ILIKE conditions)
@Vikas no, this is all I have written about it

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.