How do I annotate a Django queryset with a regex capture group, without using RawSQL, so that I can later use that value for filtering and sorting?

For example, in PostgreSQL I could make the following query:

CREATE TABLE foo (id varchar(100));

INSERT INTO foo (id) VALUES ('disk1'), ('disk10'), ('disk2');

SELECT
    "foo"."id",
    CAST((regexp_matches("foo"."id", '^(.*\D)([0-9]*)$'))[2] AS integer) as grp2
FROM "foo"
ORDER BY "grp2"
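
To see what the second capture group yields for the sample rows, the same pattern can be tried in plain Python. This is only an illustrative sketch using the standard `re` module, not part of the query itself; the helper name `trailing_number` is hypothetical, and it assumes Python's regex dialect behaves like PostgreSQL's for this simple pattern.

```python
import re

# Same pattern as in the SQL query: group 1 is everything up to the last
# non-digit character, group 2 is the run of digits at the end.
PATTERN = re.compile(r'^(.*\D)([0-9]*)$')


def trailing_number(value):
    """Return the trailing digits of `value` as an int, or None if unavailable."""
    match = PATTERN.match(value)
    if match is None or match.group(2) == '':
        # Either there is no non-digit character at all (no match),
        # or the string does not end in digits (empty second group).
        return None
    return int(match.group(2))


ids = ['disk1', 'disk10', 'disk2']
ordered = sorted(ids, key=trailing_number)
print(ordered)
```

The two `None` cases are worth checking against real data. In PostgreSQL, casting an empty string to integer raises an error, and `regexp_matches` returns no row at all when the pattern does not match, so rows such as `'disk'` or `'123'` would not simply sort last.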

2 Answers

From Django 1.8 onwards, you can use Func() expressions.

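from django.db.models import Func, IntegerField

class EndNumeric(Func):
    """Extract the trailing digits of a string as an integer (PostgreSQL only)."""
    function = 'REGEXP_MATCHES'
    # Raw string, so that \D reaches the database unchanged.
    template = r"(%(function)s(%(expressions)s, '^(.*\D)([0-9]*)$'))[2]::integer"
    # The template casts to integer, so declare a matching output type.
    output_field = IntegerField()

qs = Foo.objects.annotate(
    grp2=EndNumeric('id'),
).values('id', 'grp2').order_by('grp2')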

Reference: Get sorted queryset by specified field with regex in django

4 Comments

Just wondering... is this really database-agnostic? (If not, I would rather use RawSQL.)
Test in your target RDBMS (which, for Django, should be PostgreSQL). Unless you're writing something that's specifically meant for distribution to people or projects that are likely to be using various databases, the often-repeated idea that you may at some point switch back-ends basically never happens unless you've made a fundamental mistake, like building your platform on SQLite.
@kungphu I have exactly that use case: I'm distributing a project that is used with various databases.
@kungphu On the project side, I totally agree; if you're working on a public app, however, that would be a requirement, or at the very least highly desirable.
You can get this working with a one-off custom Func class, but I would rather implement it as a regular function that composes with other functions and annotations, like any other building block in the Django ORM.

I'll start with a "beta version" of the class, which looks like this:

from django.db.models.expressions import Func, Value

class RegexpMatches(Func):
    function = 'REGEXP_MATCHES'

    def __init__(self, source, regexp, flags=None, group=None, output_field=None, **extra):
        template = '%(function)s(%(expressions)s)'

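        # Wrap a plain-string pattern in Value() regardless of `group`;
        # otherwise Func would treat it as a field name.
        if not hasattr(regexp, 'resolve_expression'):
            regexp = Value(regexp)

        if group:
            # Index into the text[] result; PostgreSQL arrays are 1-based.
            template = '({})[{}]'.format(template, int(group))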

        expressions = (source, regexp)
        if flags:
            if not hasattr(flags, 'resolve_expression'):
                flags = Value(flags)

            expressions += (flags,)

        self.template = template

        super().__init__(*expressions, output_field=output_field, **extra)

Here is a fully working example for an admin interface:

from django.contrib.admin import ModelAdmin, register
from django.db.models import IntegerField
from django.db.models.functions import Cast
from django.db.models.expressions import Func, Value

from .models import Foo


class RegexpMatches(Func):
    function = 'REGEXP_MATCHES'

    def __init__(self, source, regexp, flags=None, group=None, output_field=None, **extra):
        template = '%(function)s(%(expressions)s)'

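        # Wrap a plain-string pattern in Value() regardless of `group`;
        # otherwise Func would treat it as a field name.
        if not hasattr(regexp, 'resolve_expression'):
            regexp = Value(regexp)

        if group:
            # Index into the text[] result; PostgreSQL arrays are 1-based.
            template = '({})[{}]'.format(template, int(group))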

        expressions = (source, regexp)
        if flags:
            if not hasattr(flags, 'resolve_expression'):
                flags = Value(flags)

            expressions += (flags,)

        self.template = template

        super().__init__(*expressions, output_field=output_field, **extra)


@register(Foo)
class FooAdmin(ModelAdmin):  # named FooAdmin so it does not shadow the imported Foo model
    list_display = ['id', 'required_field', 'required_field_string']

    def get_queryset(self, request):
        qs = super().get_queryset(request)

        return qs.annotate(
            required_field=Cast(RegexpMatches('id', r'^(.*\D)([0-9]*)$', group=2), output_field=IntegerField()),
            required_field_string=RegexpMatches('id', r'^(.*\D)([0-9]*)$', group=2)
        )

    def required_field(self, obj):
        return obj.required_field

    def required_field_string(self, obj):
        return obj.required_field_string

As you can see, I've added two annotations: one is cast to an integer and the other is left as a plain string. The difference is not visible in the admin interface, but it shows up in the SQL that is executed:

SELECT "test_foo"."id" AS Col1,
       ((REGEXP_MATCHES("test_foo"."id", '^(.*\D)([0-9]*)$'))[2])::integer AS "required_field",
       (REGEXP_MATCHES("test_foo"."id", '^(.*\D)([0-9]*)$'))[2] AS "required_field_string"
  FROM "test_foo"
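
The shape of that SQL follows from how the template string is built. `Func` renders its SQL by interpolating `template` with the `function` name and the comma-joined `expressions`. The sketch below imitates that step with plain string formatting so it runs without Django. The `build_template` and `render` helpers and the hard-coded SQL fragments are illustrative assumptions, not Django's actual compiler.

```python
def build_template(group=None):
    """Mimic how RegexpMatches chooses its template."""
    template = '%(function)s(%(expressions)s)'
    if group:
        # Wrap the call in parentheses and index into the returned array.
        template = '({})[{}]'.format(template, int(group))
    return template


def render(template, function, expressions):
    """Imitate the final interpolation step (hypothetical helper)."""
    return template % {
        'function': function,
        'expressions': ', '.join(expressions),
    }


# Already-compiled SQL for the two arguments, written out by hand here.
args = ['"test_foo"."id"', r"'^(.*\D)([0-9]*)$'"]

plain = render(build_template(), 'REGEXP_MATCHES', args)
indexed = render(build_template(group=2), 'REGEXP_MATCHES', args)
print(plain)
print(indexed)
```

Without `group` the expression evaluates to the whole `text[]` array; with `group=2` it is wrapped in parentheses and indexed, which is the fragment that `Cast` then turns into `(...)::integer` in the first annotation.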

[Screenshot: REGEXP_MATCHES admin interface example with configurable group, flags, and everything]

GitHub gist with better source code formatting: https://gist.github.com/phpdude/50675114aaed953b820e5559f8d22166
