1

I try to incorporate fuzzy serach function in a django project without using Elasticsearch.

1- I am using postgres, so I first tried levenshtein, but it did not work for my purpose.

class Levenshtein(Func):
    template = "%(function)s(%(expressions)s, '%(search_term)s')"
    function = "levenshtein"

    def __init__(self, expression, search_term, **extras):
        super(Levenshtein, self).__init__(
            expression,
            search_term=search_term,
            **extras
        )
items = Product.objects.annotate(lev_dist=Levenshtein(F('sort_name'), searchterm)).filter(
                            lev_dist__lte=2
                        )

Search "glyoxl" did not pick up "4-Methylphenylglyoxal hydrate", because levenshtein considered "Methylphenylglyoxal" as a word and compared with my searchterm "glyoxl".

2. trigram_similar gave weird results and was slow

items = Product.objects.filter(sort_name__trigram_similar=searchterm)
  • "phnylglyoxal" did not pick up "4-Methylphenylglyoxal hydrate", but picked up some other similar terms: "4-Hydroxyphenylglyoxal hydrate", "2,4,6-Trimethylphenylglyoxal hydrate"

  • "glyoxl" did not pick up any of the above terms

3. python package, fuzzywuzzy seems can solve my problem, but I was not able to incorporate it into query function.

ratio= fuzz.partial_ratio('glyoxl', '4-Methylphenylglyoxal hydrate') 

# ratio = 83

I tried to use fuzz.partial_ratio function in annotate, but it did not work.

items = Product.objects.annotate(ratio=fuzz.partial_ratio(searchterm, 'full_name')).filter(
                            ratio__gte=75
                        )

Here is the error message:

QuerySet.annotate() received non-expression(s): 12.

According to this stackoverflow post (1), annotate does not take regular python functions. The post also mentioned that from Django 2.1, one can subclass Func to generate a custom function. But it seems that Func can only take database functions such as levenshtein.

Any way to solve these problems? thanks!

2
  • You can not use partial, since the annotations do not run at the Django/Python level. What the Django ORM does is construct a database query. A function like Levenshtein is thus only some mechanism to write levenshtein(sort_name), etc. in the query, not evaluate it at the Django/Python level. Commented Nov 21, 2020 at 20:02
  • @WillemVanOnsem Thank you for the quick response. Any idea how I should implement fuzzy search then? Commented Nov 21, 2020 at 20:14

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.