For example, I have a list of terms and a string:
var terms = new[] { "programming language", "programming", "language" };
var content = "A programming language is a formal language that "
+ "specifies a set of instructions that can be used to "
+ "produce various kinds of output.";
I can call Regex.Matches(content, term).Count for each term and sum the results, which gives 4 occurrences in the string:
- "programming language": 1 time
- "programming": 1 time
- "language": 2 times
But some of these overlap: the match for "programming language" already covers one "programming" and one "language", so there should be only 2 occurrences.
My current solution is to save the start and end index of each occurrence, then check each new match against the saved occurrences to see whether it falls inside a range that has already been counted. Is there a better way that doesn't involve tracking start and end indexes?
(programming language|programming|language) should do what you want, if you do it right.
"programming language" counts as one term; if I remove it, the current example would return 3 occurrences, not 2 as I expected.
Count works with an accumulator, so "programming" occurs once and "language" occurs twice; after summing, doesn't it return 3?
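The single-pattern idea from the first comment could be sketched roughly as follows. This is only an illustration, not a tested solution from the thread: TermCounter and CountNonOverlapping are hypothetical names, and the sketch assumes the terms should be matched literally (hence Regex.Escape) and that longer terms should take priority over shorter ones they contain. It relies on .NET trying alternatives left to right and on Regex.Matches returning non-overlapping matches.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original question.
public static class TermCounter
{
    // Counts non-overlapping occurrences of any term in content.
    public static int CountNonOverlapping(IEnumerable<string> terms, string content)
    {
        // Put longer terms first so that "programming language" is tried
        // before "programming" at the same position.
        var alternatives = terms
            .OrderByDescending(t => t.Length)
            .Select(t => Regex.Escape(t)); // treat each term as literal text

        // Build one alternation, e.g. (?:programming language|programming|language)
        var pattern = "(?:" + string.Join("|", alternatives) + ")";

        // Regex.Matches never returns overlapping matches, so text consumed
        // by a longer term is not counted again for a shorter one.
        return Regex.Matches(content, pattern).Count;
    }
}
```

With the terms and content from the question, TermCounter.CountNonOverlapping(terms, content) returns 2. The sketch does not add word boundaries, so a term would also match inside a longer word; whether that matters depends on the data.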