Given two lists, I'm calculating a distance between each pair of words in a nested for loop:

from fuzzywuzzy import fuzz

l = ['mango','apple']
l2 = ['ola','john']

for i in l:
    for j in l2:
        print(i,j,fuzz.ratio(i,j))


mango ola 25
mango john 22
apple ola 25
apple john 0

I would like to find the maximum value for every element of the outer loop. The expected result would be:

mango ola 25
apple ola 25

The other pairs are dropped because they have a lower value.

One strategy I could think of is to use pandas, but I would prefer a pure Python implementation. The pandas way, for reference:

from fuzzywuzzy import fuzz
import pandas as pd

l = ['mango','apple']
l2 = ['ola','johnkoo']

result = []
for i in l:
    for j in l2:
        result.append((i,j,fuzz.ratio(i,j)))

df = pd.DataFrame(result,columns = ['word1','word2','distance'])

# keep the rows whose distance equals the maximum within their word1 group
idx = df.groupby(['word1'])['distance'].transform('max') == df['distance']
print(df[idx])

1 Answer

The built-in max function accepts a key keyword argument: a function applied to each item to produce the value that is compared, which determines what counts as the maximum. So you can take the maximum by the third item of each (i,j,fuzz.ratio(i,j)) tuple generated in the inner loop:

for i in l:
    print(max([(i,j,fuzz.ratio(i,j)) for j in l2], key=lambda x: x[2]))

Output:

('mango', 'ola', 25)
('apple', 'ola', 25)

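To format it as in your example, keep the score as an integer so the comparison stays numeric, and unpack the winning tuple so print separates the items with spaces: print(*max([(i,j,fuzz.ratio(i,j)) for j in l2], key=lambda x: x[2])).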
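One difference from the pandas version is worth noting: max returns only the first item that reaches the maximum, while the groupby/transform comparison keeps every row tied for it. If ties matter, a sketch along the following lines might help. The names similarity and best_matches are hypothetical, and the scorer is a standard-library stand-in (difflib.SequenceMatcher) used only so the snippet runs without extra packages; passing scorer=fuzz.ratio should give the behaviour from the question.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in scorer (an assumption, not fuzzywuzzy): 0-100 similarity."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def best_matches(words, candidates, scorer=similarity):
    """For each word, return every (word, candidate, score) tied for the top score."""
    results = []
    for word in words:
        # score the word against every candidate, as in the inner loop
        scored = [(word, cand, scorer(word, cand)) for cand in candidates]
        # highest score for this word (raises ValueError if candidates is empty)
        top = max(score for _, _, score in scored)
        # keep all rows that reach the top score, preserving candidate order
        results.extend(row for row in scored if row[2] == top)
    return results


for row in best_matches(['mango', 'apple'], ['ola', 'john']):
    print(*row)
```

Because the scorer is a parameter, the same function works with any callable that takes two strings and returns a comparable number.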
