I need some help with a Java regex.

My text is: abc abc abc xyz xyz xyz. I need to find all matches that have exactly one word between abc and xyz.

These are the two matches that should be returned:

  1. abc abc xyz ==> an abc is present between abc and xyz
  2. abc xyz xyz ==> an xyz is present between abc and xyz

My regex pattern:

abc\s+([a-z]*?)\s+xyz

It finds only the first match, abc abc xyz. It does not find abc xyz xyz.

What is the right pattern to find both?

  • I don't understand your expected matches. Can you edit your question and show us what you perceive all matches to be? Commented Aug 16, 2018 at 4:28
  • I agree with Tim, it's not clear what you expect. I think you're looking for a lot more matches. Commented Aug 16, 2018 at 4:31
  • guys, please check it is very clear. actually there are 2 matches Commented Aug 16, 2018 at 4:33
  • @TimBiegeleisen, it is updated now. Commented Aug 16, 2018 at 4:34
  • as i said just one word Commented Aug 16, 2018 at 4:35

4 Answers

4

If you just need the one word in between and not the full match, as you stated in the comments, you can use a positive lookbehind and a positive lookahead, like this:

(?<=abc\s)[a-z]+(?=\sxyz)

Here's a demo.
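
For reference, here is a minimal Java sketch of the pattern above applied to the sample text from the question. The class and method names (LookaroundDemo, wordsBetween) are made up for illustration. Because the lookbehind and lookahead are zero-width, the surrounding abc and xyz are checked but not consumed, so a word that ends one match can still serve as context for the next.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Lookbehind: "abc" plus one whitespace character must precede the word.
    // Lookahead: one whitespace character plus "xyz" must follow it.
    // Neither is part of the match, so only the word itself is returned.
    private static final Pattern WORD_BETWEEN =
            Pattern.compile("(?<=abc\\s)[a-z]+(?=\\sxyz)");

    // Hypothetical helper: collects every word found between abc and xyz.
    static List<String> wordsBetween(String input) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD_BETWEEN.matcher(input);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // Prints [abc, xyz] for the text from the question.
        System.out.println(wordsBetween("abc abc abc xyz xyz xyz"));
    }
}
```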


If you do need the full match or you expect to have multiple spaces before/after the word, you might want to check Andreas's answer.


4 Comments

FYI: This cannot handle multiple spaces between words. It also only returns the word (e.g. abc), not the entire match (e.g. abc abc xyz), as requested in the question.
@Andreas I agree with the first point. For the second point, check the comments above.
You should answer the question. If you answer a comment, you should clarify that in your answer, e.g. with a quote of and/or link to the comment.
@Andreas Done :)
1

Here is a regex that can handle multiple spaces, can tell you where the full match was found and where the word itself was found, and doesn't require resetting the Matcher to continue the search.

String input = "abc  abc  abc  xyz  xyz  xyz";

Pattern p = Pattern.compile("abc(?=(\\s+([a-z]+)\\s+xyz))");
for (Matcher m = p.matcher(input); m.find(); ) {
    String match = m.group() + m.group(1);
    String word = m.group(2);
    System.out.printf("%d-%d: %s%n", m.start(), m.end(1), match);
    System.out.printf("  %d-%d: %s%n", m.start(2), m.end(2), word);
}

Output

5-18: abc  abc  xyz
  10-13: abc
10-23: abc  xyz  xyz
  15-18: xyz

It works by matching only the leading abc directly, then matching the rest in a zero-width positive lookahead and capturing the entire lookahead match, so the "full" match can be rebuilt. Because only the leading abc is consumed, the second search can start matching at the word captured by the previous match.

For extra bonus points, it also captures just the word itself.

You can then choose whether you want the full match, or just the word.
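
As a sketch of the "full match" option, the same pattern can be wrapped in a small helper that rebuilds each match from group 0 plus group 1 and returns them as a list. The names OverlapDemo and fullMatches are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OverlapDemo {
    // Only the leading "abc" is consumed; the rest is captured inside a lookahead.
    private static final Pattern P =
            Pattern.compile("abc(?=(\\s+([a-z]+)\\s+xyz))");

    // Hypothetical helper: returns every (possibly overlapping) full match.
    static List<String> fullMatches(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = P.matcher(input);
        while (m.find()) {
            // group() is the consumed "abc"; group(1) is the text seen by the lookahead.
            result.add(m.group() + m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [abc abc xyz, abc xyz xyz] for the text from the question.
        System.out.println(fullMatches("abc abc abc xyz xyz xyz"));
    }
}
```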

1

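Perhaps you need to change how you search: after each match, restart the search one character past the point where that match began, so overlapping matches are found too: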

public static void main(String... args) {
    String s = "abc abc abc xyz xyz xyz";
    Pattern pattern = Pattern.compile("(abc\\s+\\w+\\s+xyz)");
    Matcher matcher = pattern.matcher(s);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
        s = s.substring(matcher.start() + 1); // cut the input just past the start of this match, then search again
        matcher = pattern.matcher(s);
    }
}

Output:

abc abc xyz
abc xyz xyz
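
A variation on the same idea, offered as a sketch rather than as part of the answer above: Matcher.find(int) resets the matcher and searches from the given index, so the input string does not need to be cut and the matcher does not need to be re-created. The names RestartSearchDemo and overlappingMatches are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RestartSearchDemo {
    private static final Pattern P = Pattern.compile("abc\\s+\\w+\\s+xyz");

    // Hypothetical helper: finds matches that may overlap by restarting
    // each search one character after the start of the previous match.
    static List<String> overlappingMatches(String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = P.matcher(input);
        int from = 0;
        // find(int) throws if the index exceeds the input length, hence the guard.
        while (from <= input.length() && m.find(from)) {
            matches.add(m.group());
            from = m.start() + 1;
        }
        return matches;
    }

    public static void main(String[] args) {
        // Prints [abc abc xyz, abc xyz xyz] for the text from the question.
        System.out.println(overlappingMatches("abc abc abc xyz xyz xyz"));
    }
}
```

Since the match offsets stay relative to the original string, m.start() and m.end() can be reported directly if positions are needed.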

-1

You forgot the delimiters (^ and $) and the repetition of the inner element. Here's a link to an online demo:

^abc\s+(?:([a-z]*?)\s+)+?xyz$

This also makes the inner element finder less greedy.
