I have 54 lists of words, and the lists vary in length. For example:
1 = ["fly", "robot", "ketchup"].
2 = ["rain", "fly", "top", "jacket"].
....
I would like to cluster similar lists into groups based on the words in each list. The order of the words in a list matters slightly, but it isn't the only criterion for a match. Any ideas? I was thinking of using BERT and then K-means clustering.
I want the lists to remain intact, just grouped/clustered.
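
To make the goal concrete, here is a rough sketch of the kind of pipeline I mean. It is not BERT: a position-weighted bag-of-words vector stands in for the embedding so the example runs with only the standard library, and the k-means is a toy version seeded with the first k vectors. Lists 3 and 4, the `decay` value, and all function names are made up for illustration.

```python
import math
from collections import defaultdict


def vectorize(words, vocab, decay=0.9):
    """Position-weighted bag-of-words vector, scaled to unit length.

    Earlier words get slightly more weight (decay ** position), so order
    matters a little without dominating. `decay` is an assumed value.
    """
    vec = [0.0] * len(vocab)
    for pos, word in enumerate(words):
        vec[vocab[word]] += decay ** pos
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def kmeans(vectors, k, max_iters=20):
    """Toy k-means; the first k vectors seed the centroids (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iters):
        # Assign every vector to its nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_lists(lists, k):
    """Return {cluster label: [list ids]}; the input lists are not modified."""
    ids = list(lists)
    all_words = sorted({w for words in lists.values() for w in words})
    vocab = {w: i for i, w in enumerate(all_words)}
    vectors = [vectorize(lists[i], vocab) for i in ids]
    labels = kmeans(vectors, k)
    groups = defaultdict(list)
    for list_id, label in zip(ids, labels):
        groups[label].append(list_id)
    return dict(groups)


lists = {
    1: ["fly", "robot", "ketchup"],
    2: ["rain", "fly", "top", "jacket"],
    3: ["robot", "ketchup", "fly"],     # hypothetical extra list
    4: ["rain", "jacket", "umbrella"],  # hypothetical extra list
}
groups = cluster_lists(lists, k=2)
print(groups)
```

On this toy data, lists 1 and 3 land in one group and lists 2 and 4 in the other. If BERT embeddings were used instead, presumably only `vectorize` would need to change; the grouping step only stores list ids, so the lists themselves stay intact.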