I need to select records from a PostgreSQL database. Let's say this is my table (very simplified):

Table1

ID       Name        Surname
1        Stan        Marsh
2        Randy       Marsh
3        Marry       Christmas

I need the results filtered by Name+Surname, so that when the filter string is, for example, "ar", all the records are returned, and "sh" would return only the first two. I am using:

select * FROM Table1
WHERE concat(Table1.Name::text, Table1.Surname::text) LIKE '%ar%'

I then use the results in C# as a List of records. Now I am wondering how this works performance-wise.

Wouldn't it be better to just iterate over all records in the List and pick the correct ones? What will make the mentioned statement perform worse, or better? A lot of records? Their variety? Or is one of these options just simply better?

Is it bad that I use the mentioned statement every time with "LIKE '%%'" (when there is no filter)?

  • 1
    Do you really want to search a concatenation? For example, if you search for %yM%, would you like to see Randy Marsh returned? Commented Jan 10, 2018 at 14:31
  • 2
    This is iterating over all the records and picking the matching ones... Because you're searching in the middle of a string, and because you're concatenating two fields, no indexes will be used in this case. The result is a FULL TABLE SCAN, which internally is just a loop over the whole table... And YES, using CONCAT(a, b) LIKE '%%' is a waste of resources. Commented Jan 10, 2018 at 14:31
  • 1
    Definitely perform the search against the database. Don't return every record back to C# and search there. Commented Jan 10, 2018 at 14:32
  • 1
    Use SQL filtering rather than filtering in C#. It's much more efficient to do data-based comparison in the DB. Commented Jan 10, 2018 at 14:49
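
To illustrate the point above about LIKE '%%' being wasted work, here is a minimal sketch of leaving the WHERE clause out when there is no filter. It is written in Python rather than the asker's C#, the helper name build_query is hypothetical, and the %s placeholders assume a psycopg-style driver; adapt the parameter syntax to whatever client library you use.

```python
def build_query(filter_text):
    """Return (sql, params); the WHERE clause is added only when a filter is given."""
    base = "SELECT * FROM Table1"
    if not filter_text:
        # No filter: skip the pointless LIKE '%%' comparison entirely.
        return base, ()
    pattern = "%" + filter_text + "%"
    # The pattern is passed as a parameter, never pasted into the SQL text.
    return base + " WHERE Name LIKE %s OR Surname LIKE %s", (pattern, pattern)


print(build_query(""))
print(build_query("ar"))
```

The returned pair would then be handed to the driver's execute call, so the database still does the filtering whenever a filter exists.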

2 Answers

As dasblinkenlight points out, maybe that's not what you really want, since you may end up with results that take a part of the name and a part from the surname as the match. If you want to filter both name and surname at the same time, you may prefer to include a space between them:

select * FROM Table1
WHERE concat(concat(Table1.Name::text, ' '), Table1.Surname::text) LIKE '%ar%'
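
The effect of the separator can be sketched outside the database. The snippet below is only an illustration in Python: it emulates LIKE '%...%' with a plain case-sensitive substring test on the sample rows from the question, and the helper names like_contains and search are made up for this example.

```python
rows = [("Stan", "Marsh"), ("Randy", "Marsh"), ("Marry", "Christmas")]


def like_contains(value, needle):
    # Stand-in for SQL "value LIKE '%needle%'" (case-sensitive substring test).
    return needle in value


def search(needle, separator):
    # Concatenate name and surname with the given separator, then filter.
    return [(n, s) for n, s in rows if like_contains(n + separator + s, needle)]


print(search("yM", ""))   # [('Randy', 'Marsh')]: the match spans both columns
print(search("yM", " "))  # []: the space breaks the cross-column match
```

With the separator in place, a filter such as "ar" still returns all three rows, because each of them contains it within a single column.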

However, the cost of the CONCAT function cannot be ignored performance-wise. If you don't mind checking the name and the surname separately, this will work faster for you:

select * FROM Table1
WHERE Table1.Name::text LIKE '%ar%' OR Table1.Surname::text LIKE '%ar%'

And for sure, getting all the data and filtering it later within C# will almost always be slower.

1 Comment

I believe concat(Table1.Name::text, ' ', Table1.Surname::text) would also work.

For the most part, I agree with @armarru's answer, however, the one issue I see is that you are shifting a poor performance burden from the client to the database. Presumably the database server is better equipped to handle this, and of course there is the network bandwidth associated with transmitting those results from the server to the application, but either way, SOME system is evaluating all records and filtering down to the ones you want.

The use of the %search string% wildcard prevents the use of any standard index, so you are looking at full-table scans.

But there is good news. There are two fantastic extensions that I think can help. The first is citext, which enables you to perform case-insensitive searches without a performance hit. Once you load the extension, you can declare your columns as citext instead of text (or varchar or whatever), and searching 'ar' will also return 'Ar' without any nasty upper or lower functions. I think even ILIKE will keep a standard index from being used.

The second is pg_trgm, which enables full wildcard searches. A normal B-tree index will support 'like%' searches (given the C locale or a text_pattern_ops operator class) but not '%like' or '%like%' searches. This extension enables indexed '%like%' searches. It's mind-blowing.
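
For a rough intuition of why this works, here is a sketch in Python of the trigram idea. The padding (two spaces in front, one behind) and the lowercasing follow what the pg_trgm documentation describes for a single word; the function names are hypothetical and this is not the extension's actual code. It only handles one word at a time.

```python
def trigrams(word):
    """Trigrams of one word, padded the way pg_trgm's docs describe."""
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def pattern_trigrams(fragment):
    """Trigrams that any value matching '%fragment%' must contain."""
    text = fragment.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


def may_match(value, fragment):
    """Index-style pre-filter; the real LIKE check still runs on the survivors."""
    return pattern_trigrams(fragment) <= trigrams(value)


print(may_match("Marsh", "arsh"))      # True: 'ars' and 'rsh' are both present
print(may_match("Christmas", "arsh"))  # False: ruled out without reading the row
print(pattern_trigrams("ar"))          # set(): too short to rule anything out
```

The last line shows a practical limit: a fragment shorter than three characters yields no trigrams, so in this sketch every row remains a candidate.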

Here is an example of what those indexes look like.

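-- The gin_trgm_ops operator class comes from the pg_trgm extension.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX Table1_ix1 on table1 using gin (Name gin_trgm_ops);
CREATE INDEX Table1_ix2 on table1 using gin (Surname gin_trgm_ops);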

Now, if you implement armarru's solution:

select * FROM Table1
WHERE Table1.Name::text LIKE '%ar%' OR Table1.Surname::text LIKE '%ar%'

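The query can use the new indexes and a bitmap OR to bring you the results very quickly, without a full-table scan. One caveat: a trigram index works from three-character sequences, so a filter shorter than three characters (such as 'ar') gives it nothing to narrow the search with.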

Unless we are talking trivial amounts of data, this will dwarf any performance you could possibly get by pulling all records onto the client and filtering there.

One other positive comment on armarru's answer: the OR solution is preferred because the database does not have to evaluate the surname when the name already matches. PostgreSQL does not guarantee the evaluation order of OR operands, though, so treat this as a possible saving rather than a certainty.
