I'm testing fuzzywuzzy's process.extractBests() as follows:

from fuzzywuzzy import process

# Define the query string
query = "Apple"

# Define the list of choices
choices = ["Apple", "Apple Inc.", "Apple Computer", "Apple Records", "Apple TV"]

# Call the process.extractBests function
results = process.extractBests(query, choices)

# Print the results
for result in results:
    print(result)

It outputs:

('Apple', 100)
('Apple Inc.', 90)
('Apple Computer', 90)
('Apple Records', 90)
('Apple TV', 90)

Why doesn't the scorer give 100 to every string, since each of them fully contains the query string ("Apple")?

I use fuzzywuzzy==0.18.0 with Python 3.11.7.

1 Answer

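fuzzywuzzy's extractBests() does not give 100 here because it does not check whether the choice contains the query; it scores how similar the two strings are as a whole. By default it uses the fuzz.WRatio scorer, which combines several measures (whole-string similarity, substring similarity, token-based similarity) and weights them according to how different the two strings are in length. When one string is much longer than the other, as with "Apple" and "Apple Inc.", WRatio relies on the substring (partial) match but scales it by 0.9, so even a perfect substring match tops out at 90. Only the "Apple" choice is identical to the query, which is why it alone scores 100. If containment is what you want to score, you can pass a different scorer, for example process.extractBests(query, choices, scorer=fuzz.partial_ratio), which should rate a full substring match as 100. I hope this helps!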
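To make the weighting concrete, here is a rough standard-library sketch of the idea. It is not fuzzywuzzy's code: the helper names (simple_ratio, simple_partial_ratio, weighted_score) are made up for illustration, it assumes the default scorer scales partial matches by 0.9 once the length ratio reaches 1.5, and it leaves out the token-based measures and punctuation stripping that the real scorer also applies.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> float:
    """Whole-string similarity on a 0-100 scale."""
    return 100 * SequenceMatcher(None, a, b).ratio()


def simple_partial_ratio(a: str, b: str) -> float:
    """Best similarity between the shorter string and any
    same-length window of the longer string."""
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    width = len(shorter)
    return max(
        simple_ratio(shorter, longer[i:i + width])
        for i in range(len(longer) - width + 1)
    )


def weighted_score(query: str, choice: str) -> int:
    """Hypothetical, simplified WRatio-style score (an assumption,
    not the library's implementation)."""
    q, c = query.lower().strip(), choice.lower().strip()
    if not q or not c:
        return 0
    base = simple_ratio(q, c)
    len_ratio = max(len(q), len(c)) / min(len(q), len(c))
    if len_ratio < 1.5:
        # Similar lengths: the whole-string similarity is used as is.
        return round(base)
    # Very different lengths: a substring match counts, but only at 90%.
    return round(max(base, simple_partial_ratio(q, c) * 0.9))


choices = ["Apple", "Apple Inc.", "Apple Computer", "Apple Records", "Apple TV"]
for choice in choices:
    print(choice, weighted_score("Apple", choice))
# Apple 100
# Apple Inc. 90
# Apple Computer 90
# Apple Records 90
# Apple TV 90
```

The 0.9 factor is what keeps a contained query from reaching 100: the substring match itself is perfect, but it is deliberately ranked below an exact match.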
