I've tried the following regex (Java string format):

^(.*(iOS\\s+[\\d\\.]+|Android\\s+[\\d\\.]+)?.*)$

The string to match is:

Some Money 2.6.2; iOS 5.1.1 

It is supposed to return three groups:

group[0] :Some Money 2.6.2; iOS 5.1.1
group[1] :Some Money 2.6.2; iOS 5.1.1
group[2] :iOS 5.1.1

but it actually returns these:

group[0] :Some Money 2.6.2; iOS 5.1.1 
group[1] :Some Money 2.6.2; iOS 5.1.1 
group[2] :null

When I change the regex as below (making the inner group required):

^(.*(iOS\\s+[\\d\\.]+|Android\\s+[\\d\\.]+).*)$

it can no longer match a string like:

whatever iS 5.1.1 whatever

What I want is a regex that returns three groups no matter what the string looks like. The first and second groups should always be the entire string. The third group should be the substring that matches '(iOS|Android) [\d.]*' if the string contains that part, and null or empty if it doesn't.

Comments
  • It doesn't match the second group, because it is optional and .* has already consumed the whole string. Commented Mar 3, 2017 at 7:44
  • I tried solving a problem with regex. Now I have two problems. Commented Mar 3, 2017 at 7:44
  • You can use ^(.*((?:iOS|Android)\s+[\d.]+).*)$ Commented Mar 3, 2017 at 7:44
  • When everything is optional in the regex, something is wrong with the approach. Look, you could use ^((?:(?!(?:iOS|Android)\s+[\d.]).)*((?:iOS|Android)\s+[\d.]+)?.*)$, but do you really want to? I suggest checking for just the (?:iOS|Android)\s+[\d.]+ pattern: if it is not found, take the whole string; otherwise grab the string and the match. Commented Mar 3, 2017 at 7:45
  • Then how can I change the regex to avoid this, since the string before "iOS " has no fixed pattern? @SebastianProske Commented Mar 3, 2017 at 7:48
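
The last suggestion in the comments (search for just the OS pattern and fall back to the whole string) could be sketched roughly as below. The class name OsTokenFinder and the helper groups are made up for illustration; the array it returns only mimics group 0 to group 2 and is not what Matcher itself gives back. Note that find() returns the first occurrence in the input, whereas the anchored patterns with a greedy .* capture the last one.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question or answers.
public class OsTokenFinder {
    // Unanchored pattern: OS name, whitespace, then digits and dots.
    private static final Pattern OS_VERSION =
            Pattern.compile("(?:iOS|Android)\\s+[\\d.]+");

    /** Returns {input, input, OS token or null}, mimicking group 0 to group 2. */
    static String[] groups(String input) {
        Matcher m = OS_VERSION.matcher(input);
        // find() scans for the first occurrence anywhere in the input.
        String os = m.find() ? m.group() : null;
        return new String[] { input, input, os };
    }

    public static void main(String[] args) {
        String[] inputs = { "Some Money 2.6.2; iOS 5.1.1 ", "whatever iS 5.1.1 whatever" };
        for (String s : inputs) {
            System.out.println(Arrays.toString(groups(s)));
        }
    }
}
```

Because the regex no longer has to describe the whole string, nothing in it is optional, and the "no OS token" case is handled in plain Java instead.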

3 Answers

2

Maybe you can use the ; delimiter as an indication of where your iOS 5.1.1 part starts?

Then a pattern may look like .+;\\s+(.+).

  • .+; consumes everything up to the semi-colon
  • \\s+ consumes the spaces between semi-colon and the start of the version string
  • (.+) consumes everything up to the end

If you really only want to match iOS or Android, you might want to add a non-capturing group within the (.+) part. The regexp would then look like this: ".+;\\s+((?:iOS|Android).+)".

Here is an executable example of what a solution may look like. It shows the behaviour of both pattern variants explained above.

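import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DelimiterPatternDemo {
    public static void main(String[] args) {
        String input1 = "Some Money 2.6.2; iS 5.1.1 ";
        String input2 = "Some Money 2.6.2; iOS 5.1.1 ";
        String input3 = "Some Money 2.6.2; Android 5.1.1 ";

        // pattern1 captures whatever follows the semicolon;
        // pattern2 only accepts a part starting with iOS or Android.
        String pattern1 = ".+;\\s+(.+)";
        String pattern2 = ".+;\\s+((?:iOS|Android).+)";

        System.out.println(pattern1);
        matchPattern(input1, pattern1);
        matchPattern(input2, pattern1);
        matchPattern(input3, pattern1);
        System.out.println();
        System.out.println(pattern2);
        matchPattern(input1, pattern2); // prints nothing: "iS" is neither iOS nor Android
        matchPattern(input2, pattern2);
        matchPattern(input3, pattern2);
    }

    // Prints group 0 and group 1 (and group 2, if the pattern defines one)
    // when the pattern matches the entire input; prints nothing otherwise.
    private static void matchPattern(String input, String pattern) {
        Pattern p = Pattern.compile(pattern);
        Matcher m = p.matcher(input);
        if (m.matches()) {
            System.out.println(m.group(0));
            System.out.println(m.group(1));
            if (m.groupCount() > 1) {
                System.out.println(m.group(2));
            }
        }
    }
}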

Update: Since the goal of the question became clearer after some edits by the author, I feel the need to update my answer. If the aim is to always get three groups, the following might be better than working out all possible notation variants:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlwaysThreeGroupsDemo {
    public static void main(String[] args) {
        String input1 = "Some Money 2.6.2; iS 5.1.1";
        String input2 = "Some Money 2.6.2; iOS 5.1.1";
        String input3 = "Some Money 2.6.2; Android 5.1.1";
        String input4 = "Some Money 2.6.2 iOS 5.1.1";
        String input5 = "Some Money 2.6.2 iOS";
        String input6 = "Some Money 2.6.2";

        // Same pattern as explained below: group 2 holds only the OS name
        // and its optional version, not the text that follows it.
        String pattern1 = "(.*?(?:((?:iOS|Android)(?:\\s+[0-9\\.]+)?).*)?)";

        System.out.println(pattern1);
        matchPattern(input1, pattern1);
        matchPattern(input2, pattern1);
        matchPattern(input3, pattern1);
        matchPattern(input4, pattern1);
        matchPattern(input5, pattern1);
        matchPattern(input6, pattern1);
    }

    // Prints groups 0, 1 and 2; group 2 prints as "null" when no OS string was found.
    private static void matchPattern(String input, String pattern) {
        Pattern p = Pattern.compile(pattern);
        Matcher m = p.matcher(input);
        if (m.matches()) {
            System.out.println(m.group(0));
            System.out.println(m.group(1));
            System.out.println(m.group(2));
            System.out.println();
        }
    }
}

Here the pattern is (.*?(?:((?:iOS|Android)(?:\\s+[0-9\\.]+)?).*)?).

  • .*? consumes everything before the version string. If there is no version string at all, it matches the whole input. The reluctant quantifier is needed here: it tries the shortest prefix first and only grows when the rest of the pattern fails, so it does not swallow the version string the way a greedy .* would.
  • (?:((?:iOS|Android)(?:\\s+[0-9\\.]+)?).*)? consumes the whole version string and everything that follows it.
  • ((?:iOS|Android)(?:\\s+[0-9\\.]+)?) is the group(2) output. It matches just the OS string, iOS or Android, with an optional version suffix consisting of digits and dots.

4 Comments

Of course not, because the regexp is explicitly limited to iOS and Android. I explained the more general approach in my answer, while the code uses the more specific one that only matches iOS and Android.
Then you just need to add the logic to return the expected results. Moreover, the unanchored regex would be more efficient.
Thanks for your answer. But I actually can't guarantee the string's format; it may contain no ';' character. What I want is for the regex to return three groups no matter what the string looks like.
0

Please refer to this topic on "How a RegEx engine works".

  1. Those based on back-tracking. These often compile the pattern into byte-code, resembling machine instructions. The engine then executes the code, jumping from instruction to instruction. When an instruction fails, it then back-tracks to find another way to match the input.

Your regular expression has many ways to match the input. Unfortunately, the engine returns the first way it finds, which is not the one you expected.

By removing the "?" quantifier from the 2nd group, the group becomes required, so any match the engine returns must include it.
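
A small sketch of that difference, using the two regexes from the question. The class name GreedyOptionalDemo and the helper group2 are invented for this illustration. With the optional group, the greedy .* takes the whole input and the engine is allowed to skip the group, so it never has to back-track; with the required group, .* must give characters back until the group can match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
public class GreedyOptionalDemo {
    // The two regexes from the question: inner group optional vs. required.
    static final String OPTIONAL = "^(.*(iOS\\s+[\\d\\.]+|Android\\s+[\\d\\.]+)?.*)$";
    static final String REQUIRED = "^(.*(iOS\\s+[\\d\\.]+|Android\\s+[\\d\\.]+).*)$";

    /** Returns group(2) if the regex matches the entire input, otherwise null. */
    static String group2(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "Some Money 2.6.2; iOS 5.1.1 ";
        System.out.println(group2(OPTIONAL, input)); // null
        System.out.println(group2(REQUIRED, input)); // iOS 5.1.1
        // The required variant rejects input without an OS token altogether.
        System.out.println(Pattern.matches(REQUIRED, "whatever iS 5.1.1 whatever")); // false
    }
}
```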

0

I finally solved the problem with the regex below.

(.*((?:iOS|Android)\\s+[0-9\\.]+).*|.*)
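
A quick way to check how this regex behaves is sketched here; the class name AlternationFallbackDemo and the helper groups are assumptions made for the example. The first alternative requires the OS token, and the bare .* alternative is only tried when the first one fails, so group 2 is filled whenever a token exists and is null otherwise.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for the regex above.
public class AlternationFallbackDemo {
    private static final Pattern P =
            Pattern.compile("(.*((?:iOS|Android)\\s+[0-9\\.]+).*|.*)");

    /** Returns groups 0, 1 and 2 for the given input. */
    static String[] groups(String input) {
        Matcher m = P.matcher(input);
        if (!m.matches()) {
            // Can only happen for input containing a line terminator,
            // because '.' does not match those by default.
            throw new IllegalStateException("no match: " + input);
        }
        return new String[] { m.group(0), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(groups("Some Money 2.6.2; iOS 5.1.1 ")));
        System.out.println(Arrays.toString(groups("whatever iS 5.1.1 whatever")));
    }
}
```

Since the leading .* is greedy, an input containing several OS tokens yields the last one in group 2.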
