
Schema:

create table records(
  id         varchar,
  updated_at bigint
);
create index index1 on records (updated_at, id);

Query. It iterates over recently updated records (keyset pagination): it fetches 10 records, remembers the last one, then fetches the next 10, and so on.

select * from records
where updated_at > '1' or (updated_at = '1' and id > 'some-id')
order by updated_at, id
limit 10;

It uses the index, but not efficiently: the condition is applied as a filter and the scan churns through tens of thousands of rows before producing 10 results; see Rows Removed by Filter: 31575 in the query plan below.

The strange thing is that if you remove the or and keep either the left or the right condition alone, both versions perform well. But the planner apparently can't figure out how to turn the index into a range scan when the two conditions are combined with or.

Limit  (cost=0.42..19.03 rows=20 width=1336) (actual time=542.475..542.501 rows=20 loops=1)
   ->  Index Scan using index1 on records  (cost=0.42..426791.29 rows=458760 width=1336) (actual time=542.473..542.494 rows=20 loops=1)
         Filter: ((updated_at > '1'::bigint) OR ((updated_at = '1'::bigint) AND ((id)::text > 'some-id'::text)))
         Rows Removed by Filter: 31575
 Planning time: 0.180 ms
 Execution time: 542.532 ms
(6 rows)
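For reference, these are the two single-condition variants mentioned above, each of which performs well on its own:

```sql
-- Fast: a range scan on the leading index column.
select * from records
where updated_at > 1
order by updated_at, id
limit 10;

-- Also fast: equality on the leading column, range on the second.
select * from records
where updated_at = 1 and id > 'some-id'
order by updated_at, id
limit 10;
```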

Postgres version is 9.6.

  • ... where updated_at > '1' ... You should not quote integer literals. Commented Sep 24, 2017 at 10:44
  • @wildplasser I tried it without quotes, same thing. Commented Sep 24, 2017 at 10:48
  • width=1336 That is a very wide table. Commented Sep 24, 2017 at 11:13

2 Answers


I would try this as two separate queries, combining their results like this:

select *
from
  (
    (
      select   *
      from     records
      where    updated_at > 1
      order by updated_at, id
      limit    10
    )
    union all
    (
      select   *
      from     records
      where    updated_at = 1
        and    id > 'some-id'
      order by updated_at, id
      limit    10
    )
  ) t
order by updated_at, id
limit    10;

My guess is that the two queries would each optimise pretty well and running both would be more efficient than the current one.
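Another option worth testing (not part of the union all approach above, and assuming the real schema matches the one shown): PostgreSQL supports row-constructor comparisons, which collapse the or pair into a single range condition that follows the index's column order:

```sql
-- "(updated_at, id) > (1, 'some-id')" compares lexicographically,
-- which is exactly the order of index1 (updated_at, id).
select *
from records
where (updated_at, id) > (1, 'some-id')
order by updated_at, id
limit 10;
```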

I would also make those columns NOT NULL if possible.


1 Comment

Yeah, I also thought about that. But I thought PostgreSQL was smart enough, and maybe there's some mistake in my code...

The explanation lies in how PostgreSQL scans multicolumn indexes; quoting the documentation:

For example, given an index on (a, b, c) and a query condition WHERE a = 5 AND b >= 42 AND c < 77, the index would have to be scanned from the first entry with a = 5 and b = 42 up through the last entry with a = 5. Index entries with c >= 77 would be skipped, but they'd still have to be scanned through. This index could in principle be used for queries that have constraints on b and/or c with no constraint on a — but the entire index would have to be scanned, so in most cases the planner would prefer a sequential table scan over using the index.

https://www.postgresql.org/docs/9.6/static/indexes-multicolumn.html
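That lexicographic ordering is exactly what keyset pagination relies on. A minimal runnable sketch of the idea, using a single row-value comparison instead of the or pair (Python with SQLite purely for illustration; table contents are made up, and row values require SQLite >= 3.15):

```python
import sqlite3

# Hypothetical data: ids and timestamps are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table records (id text, updated_at integer)")
conn.execute("create index index1 on records (updated_at, id)")
conn.executemany(
    "insert into records values (?, ?)",
    [(f"id-{i:03d}", i // 3) for i in range(30)],  # duplicate timestamps on purpose
)

def fetch_page(cursor_ts, cursor_id, page_size=10):
    # One row-value range condition replaces the OR pair; it means
    # "strictly after (cursor_ts, cursor_id) in (updated_at, id) order",
    # matching the lexicographic order of index1.
    return conn.execute(
        "select id, updated_at from records"
        " where (updated_at, id) > (?, ?)"
        " order by updated_at, id limit ?",
        (cursor_ts, cursor_id, page_size),
    ).fetchall()

page1 = fetch_page(-1, "")                      # everything sorts after (-1, '')
page2 = fetch_page(page1[-1][1], page1[-1][0])  # resume after the last row seen
```

Note how the page boundary falls inside a run of duplicate updated_at values and the cursor still resumes at the right row, because id breaks the tie.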
