Inconsistent pagination when sorting by non-unique columns #8840
Unanswered
alextatarinov
asked this question in
Potential Issue
Replies: 1 comment 6 replies
-
Let's wait for a consensus on Django first.
-
As described in https://code.djangoproject.com/ticket/34251, some database backends (incl. PostgreSQL) will produce non-deterministic ordering when sorting by columns with non-unique values.
The standard DRF setup with OrderingFilter and PageNumberPagination or LimitOffsetPagination invites this problem, and I've been bitten by it several times. The OrderingFilter abstraction makes you think that everything should work just fine with, e.g.,
ordering_fields = ['name', 'created_date', 'price']
This is easy to overlook: non-unique columns like "name" and "price" can have mostly unique values, and duplicate values have to fall on a page boundary for the issue to show, so it is more likely to surface in production after everything has worked fine for quite a while. Moreover, unless you have a "load more" UI (which was my case), it's not easy to spot duplicate or missing rows across pages.
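To make the failure mode concrete, here is a small pure-Python simulation (no database involved). It assumes the backend may return tied rows in a different order on each query, modelled here by feeding the same rows to the sort in two different scan orders. The `rows` data and the `page` helper are made up for illustration.

```python
# Four rows; pk 2 and pk 3 share the same price, so "ORDER BY price" has a tie.
rows = [
    {"pk": 1, "price": 10},
    {"pk": 2, "price": 20},
    {"pk": 3, "price": 20},
    {"pk": 4, "price": 30},
]


def page(scan_order, key, offset, limit):
    """Sort the rows and return the pks of one page.

    Python's sort is stable, so tied rows keep their scan order. That stands in
    for a database that resolves ties arbitrarily on each query.
    """
    ordered = sorted(scan_order, key=key)
    return [row["pk"] for row in ordered[offset:offset + limit]]


def by_price(row):
    return row["price"]


def by_price_then_pk(row):
    return (row["price"], row["pk"])


# Two requests that happen to see the rows in different physical order.
first_scan = rows
second_scan = list(reversed(rows))

# Ordering by price only: the tie lands on the page boundary.
page1 = page(first_scan, by_price, offset=0, limit=2)
page2 = page(second_scan, by_price, offset=2, limit=2)
print(page1, page2)  # pk 2 shows up on both pages, pk 3 on neither

# Ordering by (price, pk): the result no longer depends on scan order.
stable1 = page(first_scan, by_price_then_pk, offset=0, limit=2)
stable2 = page(second_scan, by_price_then_pk, offset=2, limit=2)
print(stable1, stable2)
```

With only two tied rows out of four, the problem shows up solely because the tie straddles the boundary between page 1 and page 2, which is why it can go unnoticed for a long time.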
The current solution requires passing "ordering=field_name,pk" from the API's consumers, which in my opinion is less than ideal.
An alternative is to enforce a total ordering automatically, which is what Django admin does in contrib/admin/views/main.py#L390 by inserting a PK when ordering by non-unique fields. A simple implementation would append the PK to whatever ordering was requested.
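A minimal sketch of that idea follows. It is neither the snippet from the original post nor Django admin's actual code. `with_pk_tiebreaker` and `TotalOrderingMixin` are hypothetical names, and the mixin assumes it is combined with `rest_framework.filters.OrderingFilter`, whose `get_ordering(request, queryset, view)` returns the ordering terms that are later passed to `order_by`.

```python
def with_pk_tiebreaker(ordering, pk_name="pk"):
    """Return the ordering terms with a primary-key tiebreaker appended.

    An empty or missing ordering is returned unchanged, so the default
    behaviour for unordered requests is not altered.
    """
    if not ordering:
        return ordering
    terms = list(ordering)
    # Compare names without the leading "-" that marks descending order.
    if any(term.lstrip("-") == pk_name for term in terms):
        return terms
    return terms + [pk_name]


class TotalOrderingMixin:
    """Hypothetical mixin, assumed to be listed before DRF's OrderingFilter."""

    def get_ordering(self, request, queryset, view):
        ordering = super().get_ordering(request, queryset, view)
        return with_pk_tiebreaker(ordering)


# Assumed usage inside a DRF project (requires djangorestframework):
#
#   from rest_framework.filters import OrderingFilter
#
#   class StableOrderingFilter(TotalOrderingMixin, OrderingFilter):
#       pass
```

This sketch always appends the key unless it is already present. It does not check whether the requested ordering already contains some other unique field, so it may add a tiebreaker that is not strictly needed.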
This solution can introduce another problem, noted in the Django ticket: ordering by multiple columns may leave existing indexes unused.
I believe this issue affects quite a lot of production systems out there and requires our attention. The behavior is definitely unexpected and hard to notice, while sorting by arbitrary non-unique columns is a commonly used pattern. To my shock, I haven't seen this issue described or discussed pretty much anywhere, meaning almost every DRF project which uses ordering + pagination is affected.
If the linked ticket is accepted, Django will issue a warning in such scenarios, which will propagate to DRF's PageNumberPagination, but not to LimitOffsetPagination (which, on a side note, will also happily accept a completely unordered QuerySet). But I don't think the warning is enough, since there is currently no reasonable solution on the DRF side.