11

I know this has been asked before, but I am unable to fix it.

For a book object with body (Spanish): "quiero mas dinero" (the real text is quite a bit longer)

My Matcher keeps returning 0 for:

    String s="mas"; // this is for testing, comes from a List<String>
    int hit=0;
    Pattern p=Pattern.compile(s,Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(mybooks.get(i).getBody());
    m.find();
    System.out.println(s+"  "+m.groupCount()+"  " +mybooks.get(i).getBody());
    hit+=m.groupCount();

I keep getting "mas 0 quiero mas dinero" on console. Why oh why?

  • There are no capturing groups in your pattern, so .groupCount() returns zero. Note that this does not return how many matches were found. Commented Sep 13, 2012 at 20:10
  • How could I then find the number of "mas" (or any other) words in a string without looping? Commented Sep 13, 2012 at 20:12
  • I'm not aware of anything in Standard Java that will let you do that. What's wrong with looping? int count = 0; for (; m.find(); count++); should give you what you want. Commented Sep 13, 2012 at 20:18
  • Nothing really, I just thought there was a single method for this (still learning Java), which would be cleaner to read. Commented Sep 13, 2012 at 20:22
  • As of Java 8, you might find it useful to use Pattern.splitAsStream().count(). Commented Feb 20, 2018 at 12:51

4 Answers

11

From the javadoc of Matcher.groupCount():

Returns the number of capturing groups in this matcher's pattern.
Group zero denotes the entire pattern by convention. It is not included in this count.

If you check the return value from m.find() it returns true, and m.group() returns mas, so the matcher does find a match.
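A minimal sketch of that check, using the sample text from the question. The class and method names (GroupCountDemo, describe) are made up for illustration and are not part of the original code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupCountDemo {
    // Runs a single find() and reports: whether it matched, the matched
    // text (or null), and the number of capturing groups in the pattern.
    static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input);
        boolean found = m.find();
        String text = found ? m.group() : null;
        return found + " " + text + " " + m.groupCount();
    }

    public static void main(String[] args) {
        // The pattern "mas" has no parentheses, so groupCount() is 0
        // even though find() succeeded.
        System.out.println(describe("mas", "quiero mas dinero")); // true mas 0
    }
}
```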

If what you are trying to do is count the number of occurrences of s in mybooks.get(i).getBody(), you can do it like this:

    String s="mas"; // this is for testing, comes from a List<String>
    int hit=0;
    Pattern p=Pattern.compile(s,Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(mybooks.get(i).getBody());
    while (m.find()) {
        hit++;
    }
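If you are on Java 9 or later, the same count can probably be written without an explicit loop via Matcher.results(), which streams every match. This is only a sketch under that assumption: MatchCounter and countMatches are hypothetical names, and Pattern.quote is my addition so that a search word containing regex metacharacters is treated literally (drop it if the entries in your list are meant to be regular expressions):

```java
import java.util.regex.Pattern;

final class MatchCounter {
    // Counts non-overlapping, case-insensitive occurrences of a literal word.
    // Assumes Java 9+, where Matcher.results() is available.
    static long countMatches(String word, String text) {
        Pattern p = Pattern.compile(Pattern.quote(word), Pattern.CASE_INSENSITIVE);
        return p.matcher(text).results().count();
    }

    public static void main(String[] args) {
        System.out.println(countMatches("mas", "quiero mas dinero")); // 1
        System.out.println(countMatches("mas", "Quiero MAS dinero, mas y mas")); // 3
    }
}
```

Internally this still iterates over find(), so it is a readability choice rather than a performance one.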

2 Comments

Thanks Keppil, this worked great. I was aware of this solution, I just thought there was a single method to do this (no looping). I have to wait a couple of minutes before marking it as accepted (Stack Overflow thing).
@dhomes This answer: stackoverflow.com/a/23244470/2818583 provides more clarity
2

How could I then find the number of "mas" (or any other) words in a string without looping?

You could use StringUtils from Apache Commons Lang:

    int countMatches = StringUtils.countMatches("quiero mas dinero...", "mas");
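If adding a dependency is not an option, a plain-Java substring count can be sketched with String.indexOf. This is not the library's implementation, just a stand-in under my own assumptions: SubstringCount and countOccurrences are hypothetical names, matching is case-sensitive, and overlapping occurrences are not counted:

```java
final class SubstringCount {
    // Counts non-overlapping, case-sensitive occurrences of sub in text.
    // Returns 0 for null or empty arguments.
    static int countOccurrences(String text, String sub) {
        if (text == null || sub == null || sub.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        // indexOf returns -1 once no further occurrence exists.
        while ((from = text.indexOf(sub, from)) != -1) {
            count++;
            from += sub.length(); // skip past this occurrence
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("quiero mas dinero", "mas")); // 1
        System.out.println(countOccurrences("mas mas", "mas")); // 2
    }
}
```

Unlike the regex in the question, this does not ignore case, so "MAS" would not be counted when searching for "mas".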

1 Comment

Will do! I have to give credit to Keppil, though, as his answer cleared up some core Java concepts for me.
0

You can add parentheses to the regex, so that it becomes "(mas)" in your example.


0

You can add parentheses to the regex, so that it becomes "(mas)" in your example.

That approach is not suitable for this task. groupCount() reports how many capturing groups the Matcher's pattern contains, not how many matches were found. Even if the pattern is "(mas)", for input text like "mas mas" m.groupCount() returns 1: the one and only group, regardless of the two matches.

So the first answer is correct: to count matches with a Matcher, you have to loop over find().
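The "(mas)" case can be checked with a short sketch. CapturingGroupDemo and its two helper methods are illustrative names only:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CapturingGroupDemo {
    // Number of capturing groups declared in the pattern; independent of the input.
    static int groups(String regex, String input) {
        return Pattern.compile(regex).matcher(input).groupCount();
    }

    // Number of non-overlapping matches, found by looping over find().
    static int matches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(groups("(mas)", "mas mas"));  // 1
        System.out.println(matches("(mas)", "mas mas")); // 2
        // groupCount() stays at 1 even when nothing matches.
        System.out.println(groups("(mas)", "no hits here"));  // 1
        System.out.println(matches("(mas)", "no hits here")); // 0
    }
}
```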

