
I have a very large table with 100M+ rows. I am trying to find if there is a faster way to execute the following.

Query:

SELECT *
FROM "public".example
WHERE a = 'foo' and b = 'bar'
order by c /* could be any of fields c to z */
limit 100;

Here is the table and the indexes I have set up now.

Table:

  • id
  • a (string)
  • b (string)
  • c ... z (all integers)

Indexes:

"example_multi_idx" btree (a, b)
"c_idx" btree (c)

Thoughts:

  • If I were only sorting by c, then an index "example_multi_idx_with_c" btree (a, b, c) performs wonderfully (see the sketch after this list). However, if I throw in a variety of ORDER BY columns, I would need to create n multi-key indexes, which seems wasteful.
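
Roughly what I mean by the single-sort-column case, as a sketch (using the index named above):

CREATE INDEX example_multi_idx_with_c ON "public".example (a, b, c);

-- With this index the planner can satisfy both the filter and the ORDER BY
-- from the index and stop after the first 100 rows:
SELECT *
FROM "public".example
WHERE a = 'foo' and b = 'bar'
order by c
limit 100;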

3 Answers


For this query:

SELECT *
FROM "public".example
WHERE a = 'foo' and b = 'bar'
order by c /* could be any of fields c to z */
limit 100;

The optimal index is example(a, b, c). Postgres should be able to use the index for sorting.

If you want to have multiple possible columns for the order by, you need a separate index for each one.
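
If you do go down that route, it would look roughly like the following, with one composite index per possible sort column (a sketch; the index names are only illustrative):

CREATE INDEX example_a_b_c_idx ON "public".example (a, b, c);
CREATE INDEX example_a_b_d_idx ON "public".example (a, b, d);
-- ... and so on, one index per ORDER BY column up to (a, b, z)

Each of these indexes repeats a and b, so total index size grows with the number of sort columns; that is the trade-off being weighed here.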


2 Comments

I agree; however, what about when there are N possible ORDER BY options, as I noted in my last bullet point? It seems wasteful to create a multi-key index for every possibility. But that may just be the way of it, unless there is a subquery or other fancy footwork that can be done?
@nakkor . . . If you want the index to be used for each condition, that is about the only choice you have. Sorting a relatively small amount of data should not take too long. So if the other conditions are highly selective, you can let Postgres do the sorting.

How large are the groups once you've filtered by a and b? While including c in the index will certainly help improve performance, if your groups are not particularly large then the sorting at the end of the operation shouldn't have a big cost.

Are you having performance issues with your current indexing?
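
One way to answer both questions is to measure directly, for example (a sketch, reusing the literals from the question):

-- How large are the biggest (a, b) groups?
SELECT a, b, count(*) AS rows_in_group
FROM "public".example
GROUP BY a, b
ORDER BY count(*) DESC
LIMIT 10;

-- And what the sort actually costs with the existing (a, b) index:
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM "public".example
WHERE a = 'foo' and b = 'bar'
ORDER BY c
LIMIT 100;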

1 Comment

The resulting sets are fewer than 10k rows. The sort cost from EXPLAIN ANALYZE seems to be around 200 ms, while with a full (a, b, c) index it is about 1 ms. I am hoping to find a good solution in between that doesn't cause huge index bloat from every multi-column index repeating a and b over and over.

Having an index directly on the ORDER BY column will work in most cases, because Postgres can then walk that column's index in order, check each row against the filters you provide, and stop after the first 100 matches.
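
As a sketch of that approach, it relies only on single-column indexes like the c_idx from the question (plus one per additional sort column, if needed):

-- The question already has: "c_idx" btree (c)
-- CREATE INDEX d_idx ON "public".example (d);  -- and so on per sort column

-- Postgres can walk c_idx in sorted order, check a and b on each heap row,
-- and stop once 100 matching rows are found. This tends to be cheap only
-- when rows matching the filter are common; if they are rare, the scan may
-- visit a large fraction of the table before it finds 100 matches.
SELECT *
FROM "public".example
WHERE a = 'foo' and b = 'bar'
ORDER BY c
LIMIT 100;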

