
I have a line of text that contains matches of the pattern \d{2}\s?[a-zA-Z]\s?\d{2,5} zero or more times.

That pattern matches character sequences that can be described as follows: two digits, followed by one letter, followed by two to five digits. The letter may be separated from the digits on either side by exactly one blank, but that's optional.

Such a sequence can occur anywhere in the text, mixed with words and numbers or on its own.

I need to remove everything but the matching sequences using pure regex substitution (no Java loops, streams, etc.).

Examples:

"foo 12a123" --> "12a123"

"12a123 bar" --> "12a123"

"foo 12a123 bar 32A1234 baz" --> "12a123 32A1234"

"12 a123 bar 32A 1234" --> "12 a123 32A 1234"

"foo bar" (no match) --> ""

Any ideas?

4
  • 2
    Like this using a capture group stackoverflow.com/questions/38132407/… see regex101.com/r/uUXEYT/1 Commented Jun 21, 2024 at 19:28
  • 1
    Mate, I've tried that example already without success, but your comment made me try it again, and guess what? It works! Thanks a lot! Commented Jun 21, 2024 at 19:40
  • @Thefourthbird, the OP wants "foo 12a123 bar 32A1234 baz" --> "12a123 32A1234", not --> "12a12332A1234". Isn't that slightly different? Commented Jun 22, 2024 at 5:16
  • what should be found in "12a12345b456 foo" or in "12a123bar" ? Commented Jun 22, 2024 at 7:33
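
To make the last comment's question concrete, here is a minimal sketch that lists what the question's pattern, exactly as written, finds in those two inputs. The class name `EdgeCaseProbe` and the helper `findAll` are hypothetical names chosen for illustration; the stream is used only to inspect the matches and is not meant as the substitution the question asks for.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class EdgeCaseProbe {
    // The pattern exactly as given in the question.
    static final Pattern SEQ = Pattern.compile("\\d{2}\\s?[a-zA-Z]\\s?\\d{2,5}");

    // Hypothetical helper: returns every substring the pattern finds, in order.
    static List<String> findAll(String input) {
        return SEQ.matcher(input).results()   // Matcher.results() needs Java 9+
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // \d{2,5} is greedy, so the first match takes all five digits.
        System.out.println(findAll("12a12345b456 foo")); // [12a12345]
        // No word boundary is required, so the trailing "bar" does not prevent a match.
        System.out.println(findAll("12a123bar"));        // [12a123]
    }
}
```

Because the first match consumes "45", nothing is left for "45b456" to match. Whether that is the desired behavior is something only the question's author can say.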

3 Answers

3

You could use replaceAll() with (?i)(\d{2}\s?[a-z]\s?\d{2,5})|.:

public class RegularExpression {
    public static void main(String[] args) {
        String input = "12a123 bar 12a123 bar 12a123 bar";
        String pattern = "(?i)(\\d{2}\\s?[a-z]\\s?\\d{2,5})|.";

        System.out.println(input.replaceAll(pattern, "$1"));
    }
}

Prints

12a12312a12312a123

Details

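  • (?i): case-insensitive flag.
  • (\\d{2}\\s?[a-z]\\s?\\d{2,5}): the capture group holding the sequence that we want to keep.
  • |.: the alternative, any single character that we don't want to keep. Group 1 is empty when this branch matches, so replacing with $1 deletes that character.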
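
As a small sketch of how the two branches interact, the snippet below runs the same pattern over the examples from the question. The class name `KeepOnlyMatches` and the method `keepMatches` are hypothetical names used for illustration only.

```java
class KeepOnlyMatches {
    // Same pattern as above: capture the wanted sequence, or consume one other character.
    static final String PATTERN = "(?i)(\\d{2}\\s?[a-z]\\s?\\d{2,5})|.";

    // Hypothetical helper wrapping the single replaceAll call.
    static String keepMatches(String input) {
        // "$1" expands to the captured sequence, or to "" when the "." branch matched.
        return input.replaceAll(PATTERN, "$1");
    }

    public static void main(String[] args) {
        System.out.println(keepMatches("foo 12a123"));                 // 12a123
        System.out.println(keepMatches("foo 12a123 bar 32A1234 baz")); // 12a12332A1234
        System.out.println(keepMatches("12 a123 bar 32A 1234"));       // 12 a12332A 1234
        System.out.println(keepMatches("foo bar").isEmpty());          // true
    }
}
```

Note that the kept sequences are concatenated without a separator, which differs from the space-separated output shown in the question's examples.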

Comments

  • You can check the link that I posted in the comments. The thing is that .+? matches one or more characters non-greedily. But nothing follows it, so the engine can settle for just 1 character :-) – The fourth bird

  • Did you know you can use the back reference as a second argument to the pattern in replaceAll? result = input.replaceAll("(?i)(\\d{2}\\s*[a-z]\\s*\\d{2,5})|.","$1"); – WJS
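
Building on the $1 replacement, here is one possible variation (not taken from any of the answers) that keeps a single blank between the sequences, as in the question's examples. It is a sketch under two assumptions: the input is a single line, since `.` does not match line terminators by default, and a trailing `trim()` call is acceptable alongside the regex substitution. `KeepMatchesSpaced`, `SEQ` and `keepSpaced` are hypothetical names.

```java
class KeepMatchesSpaced {
    // The sequence pattern exactly as given in the question.
    static final String SEQ = "\\d{2}\\s?[a-zA-Z]\\s?\\d{2,5}";

    // Hypothetical helper: no loops or streams, one replaceAll plus trim().
    static String keepSpaced(String input) {
        return input
                // Branch 1: skip lazily up to the next sequence and capture it.
                // Branch 2: once no sequence is left, consume the rest of the line.
                // Either way the replacement is group 1 (possibly empty) plus a blank.
                .replaceAll(".*?(" + SEQ + ")|.+", "$1 ")
                // Remove the blank(s) left over at the end.
                .trim();
    }

    public static void main(String[] args) {
        System.out.println(keepSpaced("foo 12a123 bar 32A1234 baz")); // 12a123 32A1234
        System.out.println(keepSpaced("12 a123 bar 32A 1234"));       // 12 a123 32A 1234
        System.out.println(keepSpaced("foo bar").isEmpty());          // true
    }
}
```

The lazy `.*?` tries the shortest prefix first, so each replacement stops at the earliest sequence, just as a plain search for the pattern would.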


11 Comments

@Thefourthbird My bad
@Thefourthbird Thanks! Updated. Didn't see your comment.
Please note the \s? in my pattern: both are essential and cannot be replaced with \s*. That change would make the pattern also match more than one blank before and/or after the letter between the numbers.
@ibexit. You might want to edit your question and explain what is and is not permitted. It is really hard to know all the constraints. Remember that others will search SO and use this question and results for similar requirements.
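
The difference between \s? and \s* raised in the comments above can be checked with a short sketch like the following. The class name `BlankTolerance` and the constants `STRICT` and `LOOSE` are hypothetical names used for illustration.

```java
class BlankTolerance {
    // At most one blank on each side of the letter, as in the question.
    static final String STRICT = "\\d{2}\\s?[a-zA-Z]\\s?\\d{2,5}";
    // Any number of blanks on each side of the letter, as in the \s* variant.
    static final String LOOSE = "\\d{2}\\s*[a-zA-Z]\\s*\\d{2,5}";

    public static void main(String[] args) {
        String oneBlank = "12 a123";
        String twoBlanks = "12  a123";

        // String.matches tests the whole string against the pattern.
        System.out.println(oneBlank.matches(STRICT));  // true
        System.out.println(oneBlank.matches(LOOSE));   // true
        System.out.println(twoBlanks.matches(STRICT)); // false
        System.out.println(twoBlanks.matches(LOOSE));  // true
    }
}
```

Only the input with two blanks separates the two patterns, which is why the substitution changes what gets kept.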
2

Matcher.results() (Java 9+) streams the MatchResult objects, from which you can grab each matched substring. Then collect them into a single string.

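import java.util.regex.Pattern;
import java.util.stream.Collectors;

String foo = "foo 12a123 bar 32A1234 baz";
String pat = "\\d{2}\\s?[a-zA-Z]\\s?\\d{2,5}";
// results() yields one MatchResult per match; join the matched text with a blank
String result = Pattern.compile(pat).matcher(foo)
        .results()
        .map(mr -> mr.group())
        .collect(Collectors.joining(" "));
System.out.println(result);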

prints

12a123 32A1234

2 Comments

thank you, but as i mentioned, no loops (or streams)
@ibexit Well, you said no loops. etc could mean a whole bunch of things. Sorry it doesn't work for you though.
0

I like WJS's answer using streams. Here's an alternative using a more old-school loop.

import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexExtractor {
    public static void main(String[] args) {
        String regex = "\\d{2}\\s?[a-zA-Z]\\s?\\d{2,5}"; 
        String inputString = "12 a123 bar 32A 1234"; 

        String result = extractMatches(regex, inputString);
        System.out.println(result);
    }

    public static String extractMatches(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(input);

        StringBuilder resultBuilder = new StringBuilder();
        boolean firstMatch = true;
        while (matcher.find()) {
            if (!firstMatch) {
                resultBuilder.append(" ");
            }
            resultBuilder.append(matcher.group());
            firstMatch = false;
        }

        return resultBuilder.toString();
    }
}

2 Comments

thank you, but as i mentioned, no loops
Oops, missed that. Apologies!
