
While implementing server-side pagination in Postgres, I learned that when using the LIMIT and OFFSET keywords you have to provide an ORDER BY clause on a unique column, typically the primary key.

In my case I am using UUIDs for primary keys, so I can't rely on a sequential order of increasing keys: ORDER BY pkey DESC might not always put newer rows on top. So I resorted to using the created-date column, a timestamp column which should be unique.

But what if the UI client wants to sort by some other column? Since that column might not always be unique, I use ORDER BY user_column, created_dt DESC to keep the results predictable for Postgres pagination.

Is this the right approach? I am not sure I am going the right way. Please advise.

2 Answers


I talked about this exact problem in an old blog post (in the context of using an ORM):

One last note about using sorting and paging in conjunction. A query that implements paging can have odd results if the ORDER BY clause does not include a field that represents an empirical sequence in the data; sort order is not guaranteed beyond what is explicitly specified in the ORDER BY clause in most (maybe all) database engines. An example: if you have 100 orders that all occurred on the exact same date, and you ask for the first page of this data sorted by this date, then ask for the second page of data sorted the same way, it is entirely possible that you will get some of the data duplicated across both pages. So depending on the query and the distribution of data that is “sortable,” it can be a good practice to always include a unique field (like a primary key) as the final field in a sort clause if you are implementing paging.

http://psandler.wordpress.com/2009/11/20/dynamic-search-objects-part-5sorting/
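The tie-breaker advice can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs anywhere, and the `orders` table, its columns, the `SORTABLE` whitelist, and the `fetch_page` helper are all hypothetical names, not anything from the original question. The same ORDER BY ... LIMIT ... OFFSET shape is valid in Postgres.

```python
import sqlite3

# Hypothetical whitelist of columns the UI may sort by. Column names cannot be
# passed as bound parameters, so they are validated before being interpolated.
SORTABLE = {"country", "customer"}


def make_demo_db():
    """Build a small in-memory table with many ties on country and created_dt."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id TEXT PRIMARY KEY, customer TEXT, country TEXT, created_dt TEXT)"
    )
    rows = [
        (f"uuid-{i:02d}", f"cust-{i % 4}", ["DE", "FR", "US"][i % 3], f"2024-01-0{1 + i % 2}")
        for i in range(10)
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    return conn


def fetch_page(conn, sort_column, page, page_size):
    """Return one page sorted by the user's column, then by tie-breakers.

    created_dt DESC keeps newer rows first within a group; the unique id is the
    final sort key, so the overall ordering has no ties left.
    """
    if sort_column not in SORTABLE:
        raise ValueError(f"cannot sort by {sort_column!r}")
    sql = (
        "SELECT id, country, created_dt FROM orders "
        f"ORDER BY {sort_column}, created_dt DESC, id "
        "LIMIT ? OFFSET ?"
    )
    return conn.execute(sql, (page_size, page * page_size)).fetchall()


conn = make_demo_db()
pages = [fetch_page(conn, "country", p, 4) for p in range(3)]
seen_ids = [row[0] for page in pages for row in page]
```

Because the last sort key is unique, every row has exactly one position in the ordering, so walking the pages returns each id once. A UUID primary key works as that final key even though it says nothing about insertion order; it only has to break ties.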


3 Comments

Thanks for the advice. My case is a little different: I want new data to appear first, so I am using a default sort order of 'created_dt DESC'. Now suppose the UI user clicks on a column header, 'country' in this case. I would hit the server to paginate and sort by country (which might not be unique), so I append created_dt as a secondary sort key: ORDER BY country, created_dt DESC. If I assume that created_dt is unique, will this ensure that the returned result set follows a predictable pattern for pagination?
I am not happy that I need to include a unique column in my case. I can't use the PK, as I am using UUIDs, which might not follow a sequential order. Now I will have to create an index on 'created_dt', and ORDER BY is a heavy operation anyway.
If you sort by something that creates an empirical sequence of the data (PK or created_dt), you will always get predictable results. If you sort by some other, non-unique column, you must also sort by an additional column (or by additional columns) that in combination will create an empirical sequence. There is simply no way around this.

The strategy of using a column that uniquely identifies a record, such as the pkey or insertion_date, may not be possible in some cases.

I have an application where users set up their own grid queries, so they can pick any columns from multiple tables, and perhaps none of them is a unique identifier.

In such a case it can be useful to use a row number. You simply select the row number and apply the same sort in its OVER clause. It would be something like:

select col1, col2, col3, row_number() over(order by col3) from tableX order by col3

It's important that over(order by *) matches order by *. That way your paging will have consistent results every time.
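One way the row-number idea could be turned into actual pages is sketched below. It is a rough illustration, not the exact query from the application described above: it runs against Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer) instead of Postgres, the sample data and the `fetch_page_by_rownum` helper are made up, and the extra columns in the OVER clause are an added assumption so that rows tied on col3 are always numbered the same way.

```python
import sqlite3


def fetch_page_by_rownum(conn, page, page_size):
    """Number every row once, then keep only the rows that fall on this page."""
    first = page * page_size + 1
    last = first + page_size - 1
    sql = """
        SELECT col1, col2, col3, rn
        FROM (
            SELECT col1, col2, col3,
                   row_number() OVER (ORDER BY col3, col1, col2) AS rn
            FROM tableX
        ) AS numbered
        WHERE rn BETWEEN ? AND ?
        ORDER BY rn
    """
    return conn.execute(sql, (first, last)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableX (col1 TEXT, col2 INTEGER, col3 TEXT)")
# col3 takes only two values, so sorting by col3 alone would leave many ties.
conn.executemany(
    "INSERT INTO tableX VALUES (?, ?, ?)",
    [(f"row-{i}", i, "AB"[i % 2]) for i in range(10)],
)
rownum_pages = [fetch_page_by_rownum(conn, p, 4) for p in range(3)]
```

Ordering the outer query by rn is equivalent to repeating the OVER clause's ORDER BY, which is the matching requirement described above. If col3 has ties and nothing else is added to the ordering, the database is free to number the tied rows differently on each request, so listing enough columns to make the ordering unambiguous is what keeps the pages stable.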
