I'm writing an endpoint to fetch data from the "Term" model in Django REST framework, and I'm trying to reduce the number of queries by prefetching data. Specifically, there is a "TermRelation" model that stores vector similarity scores between individual terms, and I would like to prefetch its data. Simplified, the models look as follows:

models.py

from django.db import models
from django.utils.translation import gettext_lazy as _

class Term(models.Model):
    term = models.CharField(max_length=255, verbose_name=_('Term'), null=True, db_index=True)

class TermRelation(models.Model):
    # Two foreign keys point at Term, so each one needs its own related_name.
    src_term = models.ForeignKey(Term, on_delete=models.CASCADE, verbose_name=_('Source term'),
                                 related_name='src_term_relation')
    trg_term = models.ForeignKey(Term, on_delete=models.CASCADE, verbose_name=_('Target term'),
                                 related_name='trg_term_relation')
    vector_sim = models.FloatField(blank=True, null=True, default=0.0,
                                   verbose_name=_('Vector similarity'),
                                   help_text=_('Cosine vector similarity.'))

And here's the simplified view:

views.py

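from rest_framework import generics

from .models import Term

class TermsList(generics.ListCreateAPIView):
    def get_queryset(self):
        # prefetch_related() already returns a queryset, so no trailing .all() is needed.
        return Term.objects.prefetch_related(
            'src_term_relation',
            'trg_term_relation',
            'note_set',
            'usage_set',
        )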

Other models related to Term, such as "Note" and "Usage", are prefetched correctly; only the relations still trigger a large number of queries. I've included a screenshot of the Django SQL debug results, or rather the first few lines, as the same queries repeat for a while. You can see that Django does run the prefetch operation but then still issues the same queries as if it hadn't happened.

What am I doing wrong? Could this be related to "TermRelation" having two ForeignKey fields pointing to the same model or REST framework not knowing how to resolve the related names?

EDIT:

Think I found something, the issue seems to lie elsewhere. In the serializer, there is a method field that counts the number of relations:

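from django.db.models import Q
from rest_framework import serializers

from .models import Term, TermRelation

class TermSerializer(serializers.ModelSerializer):
    relations_count = serializers.SerializerMethodField()

    def get_relations_count(self, obj):
        # Builds a new queryset on every call, so this hits the database once per term.
        rels = TermRelation.objects.filter(Q(src_term=obj) | Q(trg_term=obj))
        return len(rels)

    class Meta:
        model = Term
        fields = '__all__'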

I'm assuming it runs a query over all TermRelations for each term that is returned by the serializer, ignoring the prefetched data. Is there a better way to do this?

  • Not related, but it's better to use TermRelation.objects.filter(Q(src_term=obj) | Q(trg_term=obj)).count() instead of len(): a COUNT query is much faster than loading every row just to count them. Commented Jun 16, 2023 at 10:44

1 Answer

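Try this; it may solve the additional-queries problem. Calling count() on the reverse relations that the view already prefetches reads the prefetch cache instead of issuing a new query per term: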

class TermSerializer(serializers.ModelSerializer):
    relations_count = serializers.SerializerMethodField()

    def get_relations_count(self, obj):
        return obj.src_term_relation.count() + obj.trg_term_relation.count()