I have implemented a string-matching function in Python that uses n-grams and similarity ratios. A condensed version of the function is shown below:

    # concise version of the function
    def match_strings(strings1, strings2, ngram_n=2, threshold=0):
        similarities = numerators / denominators
        similarities = np.where(similarities > threshold, similarities, 0)

The function is designed to compare two lists of strings (strings1 and strings2) and return matches based on their similarity ratio, with an optional threshold parameter.

However, I'm encountering an issue where even when I set the threshold parameter to 0, indicating that strings should match regardless of dissimilarity, certain string pairs fail to match.

For example, when strings1 contains 'J.S' and strings2 contains 'jayanthsjay', the function doesn't recognize them as a match despite the threshold being set to 0.
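To illustrate why this particular pair may score exactly zero, here is a minimal sketch. It assumes character bigrams and a Jaccard-style ratio (shared n-grams over all distinct n-grams); the helper names `char_ngrams` and `jaccard_similarity` are hypothetical, and the real `numerators / denominators` computation may differ.

```python
def char_ngrams(text, n=2):
    """Return the set of character n-grams in text (hypothetical helper)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard_similarity(a, b, n=2):
    """Shared n-grams divided by all distinct n-grams (an assumed ratio)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0


score = jaccard_similarity("J.S", "jayanthsjay")
print(score)       # 0.0: the two strings share no bigram
print(score > 0)   # False, so a strict `>` comparison rejects the pair
print(score >= 0)  # True
```

Under these assumptions the bigrams of 'J.S' are 'J.' and '.S', neither of which occurs in 'jayanthsjay', so the ratio is 0 and fails a `> 0` test.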

Could anyone help me understand why setting the threshold to 0 doesn't enforce matching for all string pairs, and how I can modify the function to achieve this behavior? Thank you!

  • What are numerators and denominators? This code won't run. Commented Feb 26, 2024 at 16:19
  • You probably also want >= instead of >. Commented Feb 26, 2024 at 16:19
  • I only shared a small snippet of the logic inside the function, but yes, I did make the changes you mentioned. Much appreciated! Commented Feb 27, 2024 at 7:21
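
The difference between `>` and `>=` raised in the comments can be sketched as follows. The scores are made up for illustration, and how the function turns the filtered array into matches is not shown in the question, so treating a zero entry as "no match" is an assumption here.

```python
import numpy as np

# Made-up scores; the 0.0 entry stands for a pair with no shared n-grams.
similarities = np.array([0.0, 0.25, 1.0])
threshold = 0

# A strict comparison excludes a score equal to the threshold.
strict_mask = similarities > threshold
# An inclusive comparison keeps it.
inclusive_mask = similarities >= threshold

print(strict_mask)     # [False  True  True]
print(inclusive_mask)  # [ True  True  True]

# np.where(..., similarities, 0) leaves a zero score at 0 with either operator,
# so code that later treats 0 as "no match" cannot tell the two cases apart.
filtered = np.where(inclusive_mask, similarities, 0)
print(np.array_equal(filtered, similarities))  # True
```

If zero-score pairs must count as matches, one option is to keep the boolean mask alongside the scores rather than relying on nonzero values to mark a match.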
