2

Is there a simple way in C# to compare two strings and find the percentage of similarity between them? Say you have the strings "I like Bing" and "I like Google": it would compare the words "I", "like", "Bing" with the words "I", "like", "Google", determine that 2/3 of them are the same, and return roughly 0.66.

4
  • do you want to do string alignment, or just compare one by one? Commented Mar 20, 2011 at 22:09
  • 1
    What's the definition of the similarity you are looking for? Commented Mar 20, 2011 at 22:10
  • 1
    What kind of similarity? Are you looking for character-to-character or patterns like "my name is marlon" and "my brother is marlon". Both will yield different results. Commented Mar 20, 2011 at 22:11
  • Your description of the problem is still a bit vague. What about case sensitivity? Punctuation? What if a word appears twice in one and once in the other? Commented Mar 20, 2011 at 22:22

2 Answers

5

The Damerau–Levenshtein distance is probably the most common approach I've seen. It should be simple enough to implement in C# given the samples on the Wikipedia page.
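For what it's worth, here is a minimal sketch of how that could look. It uses the optimal string alignment variant (the simpler of the Damerau–Levenshtein formulations, which never edits the same substring twice) and works on characters, not words. The names `StringSimilarity`, `OsaDistance` and `Similarity` are made up for illustration, and dividing by the longer string's length is just one possible way to turn a distance into a percentage.

```csharp
using System;

public static class StringSimilarity
{
    // Optimal string alignment distance: counts insertions, deletions,
    // substitutions and transpositions of two adjacent characters.
    public static int OsaDistance(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i; // delete all of a
        for (int j = 0; j <= b.Length; j++) d[0, j] = j; // insert all of b

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1,      // deletion
                             d[i, j - 1] + 1),     // insertion
                    d[i - 1, j - 1] + cost);       // substitution or match

                // Adjacent transposition, e.g. "ca" -> "ac".
                if (i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1])
                    d[i, j] = Math.Min(d[i, j], d[i - 2, j - 2] + 1);
            }
        }
        return d[a.Length, b.Length];
    }

    // Assumed normalisation (not part of the algorithm itself):
    // 1.0 for identical strings, dropping as more edits are needed.
    public static double Similarity(string a, string b)
    {
        int longest = Math.Max(a.Length, b.Length);
        return longest == 0 ? 1.0 : 1.0 - (double)OsaDistance(a, b) / longest;
    }
}
```

Calling `StringSimilarity.Similarity("I like Bing", "I like Google")` gives a character-level score, so it will not match the word-based 2/3 from the question; you would need to compare word tokens for that.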


5 Comments

And it's Jonas FTW! Awesome link :)
en.wikipedia.org/wiki/Needleman-Wunsch_algorithm This might be more relevant if one wants to do sequence alignment, though.
I'm going to have to favorite this question just for the excellent assortment of links being provided here. These may easily prove useful in some of my employer's projects.
2

A couple of approaches you might check out are Levenshtein Distance and a Soundex Algorithm.
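To get something close to the word-based ratio in the question, one option is to run plain Levenshtein distance over word tokens instead of characters. This is only a sketch under a few assumptions the question leaves open: words are split on single spaces, comparison ignores case, punctuation is not stripped, and the score is normalised by the longer word count. `WordSimilarity` and `Compare` are hypothetical names.

```csharp
using System;

public static class WordSimilarity
{
    // Levenshtein distance computed over words, turned into a 0..1 score.
    public static double Compare(string left, string right)
    {
        var separators = new[] { ' ' };
        string[] a = left.Split(separators, StringSplitOptions.RemoveEmptyEntries);
        string[] b = right.Split(separators, StringSplitOptions.RemoveEmptyEntries);
        if (a.Length == 0 && b.Length == 0) return 1.0; // nothing to differ on

        // Two rolling rows of the usual dynamic-programming table.
        var prev = new int[b.Length + 1];
        var curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                // Assumption: case-insensitive word match.
                bool same = string.Equals(a[i - 1], b[j - 1],
                                          StringComparison.OrdinalIgnoreCase);
                int cost = same ? 0 : 1;
                curr[j] = Math.Min(
                    Math.Min(prev[j] + 1, curr[j - 1] + 1), // drop or add a word
                    prev[j - 1] + cost);                    // replace or keep a word
            }
            var tmp = prev; prev = curr; curr = tmp;        // reuse the rows
        }

        int distance = prev[b.Length];
        return 1.0 - (double)distance / Math.Max(a.Length, b.Length);
    }
}
```

With the strings from the question, `WordSimilarity.Compare("I like Bing", "I like Google")` needs one word replacement out of three words, so the score comes out at 2/3.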

Comments
