
Here is a problem I have been struggling with for a little while now.

For now, I have decided to simply limit the request sizes, but I would like to know why this happens. If you can take some time to look into this problem, thank you.

Context

Here are the models:

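from django.db.models import BigAutoField, BigIntegerField, CharField, Model

class Zone(Model):
    zone_id = BigAutoField(primary_key=True)
    # choices must be (value, label) pairs
    type = CharField(max_length=6, choices=(('TypeA', 'TypeA'), ('TypeB', 'TypeB'), ('TypeC', 'TypeC')))
    # other_fields (elided)

class User(Model):
    user_id = BigAutoField(primary_key=True)
    # other_fields (elided)

class Access(Model):
    access_id = BigAutoField(primary_key=True)
    # 'empty' is not a Django field option; 'blank' is the closest one
    zone_id = BigIntegerField(null=False, blank=False)
    user_id = BigIntegerField(null=False, blank=False)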

There are two databases. These models live in the one I do not own; I only have USAGE rights on it. That is why I did not add a ForeignKey field: I feared that Django would expect its own automatically generated indexes, which is not how the database has been set up.

The database is PostgreSQL. The table Zone already has its indexes zone_access_id_idx and zone_user_id_idx, which is, I suppose, why the predictable JOIN SQL query (detailed below) returns instantly when I run it in the psql console, and even from a simple standalone Python script using psycopg2. (The zone table has about 100,000 rows in total, but the query filters it down to about 100. I want about 4,000 accesses, while the access table has about 40,000,000 rows in total.)

class AccessViewset(ModelViewSet):  # assuming Django REST framework's ModelViewSet
    def list(self, request, *args, **kwargs):
        user_fields = [f.name for f in User._meta.get_fields()]
        zone_fields = [f.name for f in Zone._meta.get_fields()]
        # 'as' is a reserved word in Python, so the result is named 'accesses'
        accesses = Access.objects.raw(f"""
            SELECT *,
            z.other_fields,
            u.other_fields
            FROM access a
            JOIN zone z
            ON a.zone_id=z.zone_id
            JOIN user u
            ON a.user_id=u.user_id
            WHERE z.type in ('TypeA', 'TypeB')
            LIMIT 500
            """)
        accesses = list(accesses)  # this takes 16sec

I cannot use Access.objects.filter(zone__type__in=[]) since, as mentioned above, the models declare no foreign keys.

Problem

When I run this query through Django, it takes much longer (16 seconds for 500 results). Through psql, it is instantaneous.
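One way to narrow down where the 16 seconds go is to time the query execution and the row fetching separately, using the same helper in both environments. The sketch below is only an illustration: timed_query is a hypothetical helper, and the in-memory sqlite3 table is a stand-in so the snippet runs anywhere. In practice the cursor would come from psycopg2 or from django.db.connections[alias].cursor().

```python
import sqlite3
import time


def timed_query(cursor, sql, params=()):
    """Return (rows, execute_seconds, fetch_seconds) for one query."""
    start = time.perf_counter()
    cursor.execute(sql, params)
    executed = time.perf_counter()
    rows = cursor.fetchall()
    fetched = time.perf_counter()
    return rows, executed - start, fetched - executed


# Stand-in data; the real comparison would use the PostgreSQL database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE access (access_id INTEGER PRIMARY KEY, zone_id INTEGER)")
cur.executemany(
    "INSERT INTO access (zone_id) VALUES (?)",
    [(i % 100,) for i in range(4000)],
)

rows, exec_s, fetch_s = timed_query(cur, "SELECT * FROM access LIMIT 500")
print(len(rows), f"execute={exec_s:.4f}s fetch={fetch_s:.4f}s")
```

If both numbers are small when measured inside the Django process, the time is probably spent after the cursor returns (for example while building model instances); if the execute step itself is slow, the connection or the query plan is the more likely place to look.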

I tried

Using psycopg2.connect().cursor().execute() instead of Access.objects.raw()

Adding the following to the Access model and faking a migration; this did not improve the query:

    class Meta:
        managed = True
        indexes = [
            models.Index(fields=["zone_id"], name="zone_id_idx", db_tablespace="access_zone_id_idx"),
            models.Index(fields=["user_id"], name="user_id_idx", db_tablespace="access_user_id_idx")
        ]

Adding sslmode='disable' with the psycopg2 method.

  • "There are two databases" -> how does this relate to the question? Does the problem occur on both? Are they identical (especially in terms of data and indexes)? Commented Nov 25, 2020 at 13:53
  • @snakecharmerb It does not relate directly, I'd say. It was more about detailing the context, so that people easily understand that I use two databases: a simple one for Django-related tables, which I own and can update through Django (with ForeignKey fields, for instance), and one that I don't own, where I only create and read rows in some tables. I hoped this could spark ideas from people who have worked in a similar environment. Commented Nov 26, 2020 at 13:03
