I have an input string in the following format

String input = "00IG356001110002005064007000000";

Characters 3-7 are the code.

Characters 8-12 are the amount.

Based on the code in the input string (IG356 in the sample), I need to capture the amount (00111 in the sample). The amount (characters 8-12) should be picked up only for specific codes, following the logic below.

  1. The code should not be SG356. If it is SG356, there is no match and processing stops.
  2. a. If the code is not SG356, check whether it is IG902 or SG350. If so, capture the amount (00111).

    else

    b. Check the 3 digits in the code (characters 5-7, 356 in this sample). If they are 200, 201, 356 or 370, capture the amount.

I am using the regular expression shown below, which uses positive lookahead and an if-then-else construct:

String regex= ".{2}(?!SG356)((?=IG902|SG350).{5}(.{5}).+|.{2}(?=200|201|356|370).{3}(.{5}).+)";

The regular expression works fine if the code in the input string is IG902 or SG350 (when the 'if' part of the regex is matched), but when the 'else' part is matched, I am unable to capture the amount.

The following regular expression works fine when I only check for a match:

.{2}(?!SG356)((?=IG902|SG350).+|.{2}(?=200|201|356|370).+)

The problem occurs only when capturing the group. I am running this in Java. Any help would be greatly appreciated.

The Java code I am using is:

public String getTsqlSum(String input, String regex){
    String value = null;
    Matcher m = Pattern.compile(regex).matcher(input);
    System.out.println("Group Count: " + m.groupCount());
    if (m.matches()) {
        for (int i=0;i<m.groupCount();i++){
            System.out.println("For i: " + i +" Value: " + m.group(i));
        }
    }
    return value;
}

public void forumTest(){
    //String input = "00IG902001110002005064007000000";
    String input = "00IG356001110002005064007000000";
    String regex= ".{2}(?!SG356)(?:(?=IG902|SG350).{5}|.{2}(?=200|201|356|370).{3})(.{5}).+";
    System.out.println(match(input, regex));
    String match = getTsqlSum(input, regex);
    System.out.println("Match: " + match);
}
  • Why use regex? Since the format of the string is fixed, using string operations would be simpler. Commented Jan 31, 2012 at 11:49
  • Yes, I agree it is very easy to do this using string operations. But I am using a custom tool built in Java, so I have to use a regex for this. Commented Jan 31, 2012 at 12:19

1 Answer

The regular expression works fine if the code in the input string is IG902 or SG350 (when the 'if' part of the regex is matched), but when the 'else' part is matched, I am unable to capture the amount.

You are able to capture the amount; the expression is working fine. But when the second part of the alternation matches (this is not a regex if-then-else), your result ends up in a different capturing group. You will find it in capturing group 3, not in group 2 as you do when the first part of the alternation matches.

String regex= ".{2}(?!SG356)((?=IG902|SG350).{5}(.{5}).+|.{2}(?=200|201|356|370).{3}(.{5}).+)";
        Group number        1                   2                                   3

In a regular expression, capturing groups are numbered by the order of their opening parentheses, and this numbering continues across an alternation. Perl has a construct (the branch reset group, (?|...)) that gives the capturing groups in each alternative the same numbers, but I think that's one of the few flavours able to do this.

In Java you need to check, after matching, which group holds the result.
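As a minimal sketch of that check, the snippet below keeps the original pattern and returns whichever of groups 2 and 3 took part in the match. The class name AmountByGroup and the helper extractAmount are hypothetical names chosen for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class AmountByGroup {
    // The original pattern from the question, unchanged.
    static final Pattern ORIGINAL = Pattern.compile(
        ".{2}(?!SG356)((?=IG902|SG350).{5}(.{5}).+|.{2}(?=200|201|356|370).{3}(.{5}).+)");

    // Returns the amount, or null when the input does not match at all.
    static String extractAmount(String input) {
        Matcher m = ORIGINAL.matcher(input);
        if (!m.matches()) {
            return null;
        }
        // group(n) returns null for a group that did not participate in the match,
        // so exactly one of group 2 and group 3 is non-null here.
        return m.group(2) != null ? m.group(2) : m.group(3);
    }

    public static void main(String[] args) {
        // First alternative (IG902): the amount is in group 2.
        System.out.println(extractAmount("00IG902001110002005064007000000"));
        // Second alternative (digits 356): the amount is in group 3.
        System.out.println(extractAmount("00IG356001110002005064007000000"));
    }
}
```

Both calls in main print 00111. An input whose code is SG356 fails the negative lookahead, so the helper returns null for it.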

See my answer here, similar topic

Alternatively, you can change your regex so that the alternation comes before the capturing group.

Try this:

.{2}(?!SG356)(?:(?=IG902|SG350).{5}|.{2}(?=200|201|356|370).{3})(.{5}).+

In both cases you will find your result in group 1. (I turned the first group into a non-capturing group using ?:.)
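Here is a small, self-contained sketch that runs the rewritten pattern against a few inputs. The class name SingleGroupAmount, the helper extractAmount, and the sample inputs with codes SG356 and XX999 are assumptions made for illustration; the other two inputs are the ones from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class SingleGroupAmount {
    // The alternation sits inside a non-capturing group, so the amount is always group 1.
    static final Pattern REWRITTEN = Pattern.compile(
        ".{2}(?!SG356)(?:(?=IG902|SG350).{5}|.{2}(?=200|201|356|370).{3})(.{5}).+");

    // Returns the amount, or null when the code does not qualify.
    static String extractAmount(String input) {
        Matcher m = REWRITTEN.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "00IG356001110002005064007000000", // digits 356    -> 00111
            "00IG902001110002005064007000000", // code IG902    -> 00111
            "00SG356001110002005064007000000", // excluded code -> null
            "00XX999001110002005064007000000"  // no rule hit   -> null
        };
        for (String s : samples) {
            System.out.println(s + " -> " + extractAmount(s));
        }
    }
}
```

Because only one capturing group remains, the caller no longer has to know which alternative matched.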

Update after the source was added

Your loop is wrong: capturing groups are numbered starting at 1, and groupCount() does not include group 0, so your loop never reaches the last group. If you want the content of group one, you have to use m.group(1).

m.group(0) contains the whole matched string.

Try this:

for (int i=1;i<=m.groupCount();i++){
    System.out.println("For i: " + i +" Value: " + m.group(i));
}
9 Comments

In some flavors, like .NET and JGSoft, you can also use named capturing groups that share the same name.
Hi Stema, thanks for taking a look. The group count here is 3, but I am getting null as the value for the 3rd group. For i: 0 Value: 00IG356001110002005064007000000; For i: 1 Value: IG356001110002005064007000000; For i: 2 Value: null. I am not able to find out why the value is null for the 3rd group. Also, is this not how the if-then-else needs to be written? (Referring to your statement "This is not a regex if-then-else".)
@TituJoseph the construct (abc|def) is an alternation (an OR). That means: try to match the first part, if it doesn't match try the second part, if it doesn't match try the third part ....
OK, thanks Stema. But the regex you provided is also not returning a value when it matches the 3rd part.
Also, the input string could take various forms: it could match the 1st part, the 2nd part or the 3rd part. For example, the string could be 00IG356001110002005064007000000 (matches the 3rd part) or 00IG902001110002005064007000000 (matches the 2nd part). In both cases, I need to get the value.