1

I have following query which is filtering results using sub queries with NOT IN how can I avoid NOT IN? Is there any alternative for NOT IN? How can I optimise overall query?

One option I see adding index on lower(display_name) but Is the any better or further optimisation I can do here?

SELECT  table_name.* FROM table_name 
WHERE (
        table_name.id IN (
        SELECT distinct on (name, mode, formulation, strength, generic_name) id FROM table_name 
        where lower(display_name) 
          NOT IN(SELECT lower(display_name) FROM table_name WHERE table_name.user_id IS NULL)
      ) 
ORDER BY table_name.created_at DESC LIMIT 100 OFFSET 0

Query Plan

------------------------------------------------------------------------------------------------
 Limit  (cost=9277.38..9277.38 rows=1 width=296)
   ->  Sort  (cost=9277.38..9351.72 rows=29736 width=296)
         Sort Key: table_name.created_at DESC
         ->  Nested Loop  (cost=8015.41..9128.70 rows=29736 width=296)
               ->  HashAggregate  (cost=8015.12..8017.12 rows=200 width=4)
                     Group Key: table_name_1.id
                     ->  Unique  (cost=7486.65..7932.69 rows=6594 width=93)
                           ->  Sort  (cost=7486.65..7560.99 rows=29736 width=93)
                                 Sort Key: table_name_1.name, table_name_1.mode, table_name_1.formulation, table_name_1.strength, table_name_1.generic_name
                                 ->  Seq Scan on table_name table_name_1  (cost=2286.19..5277.28 rows=29736 width=93)
                                       Filter: (NOT (hashed SubPlan 1))
                                       SubPlan 1
                                         ->  Bitmap Heap Scan on table_name table_name_2  (cost=106.16..2272.68 rows=5402 width=32)
                                               Recheck Cond: (user_id IS NULL)
                                               ->  Bitmap Index Scan on index_table_name_on_user_id  (cost=0.00..104.81 rows=5402 width=0)
                                                     Index Cond: (user_id IS NULL)
               ->  Index Scan using table_name_pkey on table_name  (cost=0.29..6.61 rows=1 width=296)
                     Index Cond: (id = table_name_1.id)

What I am trying to do is table_name is drugs which holds two list like

  1. System wide drugs like added by admins in this case we keep user_id as null.
  2. User specific drugs means added by user with user_id value

I am returning two results in search 1. admin drugs 2. all users drugs like

Admin list

admin_list = select * from table_name user_id IS NULL and LOWER(table_name.name) like %cal%

All display names from admin list which need to skip in following query

SELECT lower(display_name) FROM table_name WHERE table_name.user_id IS NULL

All users drugs but skipping / filtering out admin list based on display_name like

user_list = SELECT  table_name.* FROM table_name 
WHERE (
        table_name.id IN (
        SELECT distinct on (name, mode, formulation, strength, generic_name) id FROM table_name 
        where lower(display_name) 
          NOT IN(SELECT lower(display_name) FROM table_name WHERE table_name.user_id IS NULL)
      ) 
AND LOWER(table_name.name) LIKE '%cal%'
ORDER BY table_name.created_at DESC LIMIT 100 OFFSET 0

Goal to have two distinct lists admin_list and user_list where user_list should not contain any admin list display_name

4
  • Can you explain what the logic is doing? Commented Oct 17, 2020 at 11:45
  • I have table called dugs, which have list of drugs. drug can belong to System wide for all users with user_id null, same drug can be user specific with user_id = 1. I want search/return two things 1. search in system with user_id = null and search in rest of the list i.e. user_id not null but skipping system wide drugs with same display name Commented Oct 17, 2020 at 13:00
  • @ParmodShinde . . . Such an explanation belongs in the question, not in a comment. Commented Oct 17, 2020 at 13:03
  • Updated in question Commented Oct 17, 2020 at 13:13

2 Answers 2

2

More often than not, NOT EXISTS if faster than NOT IN

SELECT t1.* 
FROM table_name t1
WHERE table_name.id IN (SELECT distinct on (t2.name, t2.mode, t2.formulation, t2.strength, t2.generic_name) t2.id 
                        FROM table_name t2
                        where not exists (select * 
                                          FROM table_name t3
                                          WHERE t3.user_id IS NULL
                                            and t2.lower(display_name) = t3.lower(display_name) )
                       ) 
ORDER BY table_name.created_at DESC 
LIMIT 100 OFFSET 0

An index on table_name ( (lower(display_name)) ) where user_is null should improve that even more.

Note that distinct on () without an ORDER BY is typically not a good idea.

Sign up to request clarification or add additional context in comments.

Comments

0

You seem to want one id per name, mode, formulation, strength, generic_name combination where the display_name has no null user_id.

If that is correct, I'm sure there are simpler ways to write the query:

select t.*
from (select distinct on (name, mode, formulation, strength, generic_name) t.* 
      from (select t.*,
                   count(*) filter (where user_id is null) over (partition by lower(display_name)) as cnt
            from table_name
           ) t
      where cnt = 0
      order by name, mode, formulation, strength, generic_name, created_at desc
     ) t
order by created_at desc
limit 100 offset 0

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.