1

I need help creating a regular expression that will parse the following string:

09-22-11 12:58:40       SEVERE       ...ractBlobAodCommand:104           -   IllegalStateException: version:1316719189017 not found in recent history                             Dump: /data1/aafghani/dev/devamir/logs/dumps/22i125840.dump

The most difficult part for me is parsing out the date. I'm not really an expert on Java regular expressions - any help is appreciated.

2
  • What fields do you want? What should the fields be from the example? How does the data vary? Commented Oct 8, 2011 at 21:43
  • Hi TJ - I'd like the date, the severity, the classname and line number, the exception message, and the dump file. Commented Oct 8, 2011 at 21:45

5 Answers

4

The question is a bit misleading, as it implies the need to parse the date into a java.util.Date object or similar. The real question is how to split the input data into the desired fields:

  • date
  • level
  • location name & line
  • exception name & message
  • dump file

This is one solution using a regular expression.

String pattern = "^(\\d{2}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2})" // date
    + "[ ]+(SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST)" // level
    + "[ ]+([^:]+):(\\d+)" // location name, location line
    + "[ ]+-[ ]+([^:]+): (.*?)" // exception name, exception message
    + "[ ]+Dump: ([a-zA-Z0-9\\./]+)" // dump
    + "$";

Pattern regex = Pattern.compile(pattern);
String input = "09-22-11 12:58:40       SEVERE       ...ractBlobAodCommand:104           -   IllegalStateException: version:1316719189017 not found in recent history                             Dump: /data1/aafghani/dev/devamir/logs/dumps/22i125840.dump";
Matcher m = regex.matcher(input);
assertTrue(m.matches());
assertEquals(7, m.groupCount()); // compare values; assertSame would compare boxed object references
for (int i = 1; i <= m.groupCount(); i++) {
  System.out.format("[%d] \"%s\"%n", i, m.group(i));
}

Output

[1] "09-22-11 12:58:40"
[2] "SEVERE"
[3] "...ractBlobAodCommand"
[4] "104"
[5] "IllegalStateException"
[6] "version:1316719189017 not found in recent history"
[7] "/data1/aafghani/dev/devamir/logs/dumps/22i125840.dump"

2 Comments

How can I make the dump file optional, TJ? Is it with the ( )?
+ "(?:[ ]+Dump: ([a-zA-Z0-9\\./]+))?" // dump (optional, including the spaces before it)
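
For reference, here is a minimal sketch of the optional-dump idea as a runnable class. It assumes the separator spaces and the "Dump: " label are wrapped together in one optional non-capturing group, and it tolerates trailing spaces at the end of the line. The class name OptionalDumpDemo and the helpers dumpOf and messageOf are hypothetical names used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalDumpDemo {
    // Same fields as the pattern in the answer, but the whole " Dump: <path>" tail is optional.
    static final Pattern LINE = Pattern.compile(
        "^(\\d{2}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2})"           // date
        + "[ ]+(SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST)"  // level
        + "[ ]+([^:]+):(\\d+)"                                   // location name, location line
        + "[ ]+-[ ]+([^:]+): (.*?)"                              // exception name, exception message
        + "(?:[ ]+Dump: ([a-zA-Z0-9\\./]+))?"                    // dump (optional)
        + "[ ]*$");                                              // allow trailing spaces

    // Returns the dump path, or null when the line has no Dump field or does not match.
    static String dumpOf(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? m.group(7) : null;
    }

    // Returns the exception message, or null when the line does not match.
    static String messageOf(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? m.group(6) : null;
    }

    public static void main(String[] args) {
        String head = "09-22-11 12:58:40       SEVERE       ...ractBlobAodCommand:104"
            + "           -   IllegalStateException: version:1316719189017 not found in recent history";
        String withDump = head + "          Dump: /data1/aafghani/dev/devamir/logs/dumps/22i125840.dump";

        System.out.println(dumpOf(withDump)); // the dump path
        System.out.println(dumpOf(head));     // null, because the group did not participate
        System.out.println(messageOf(head));  // the message, without the dump
    }
}
```

When the optional group does not participate in the match, m.group(7) returns null, so callers should check for that before using the value.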
3

Don't parse a date with regular expressions. Instead use a SimpleDateFormat object.

e.g.,

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Foo001 {
   public static void main(String[] args) {
      String test = "    09-22-11 12:58:40       SEVERE       ...ractBlobAodCommand:104           -   IllegalStateException: version:1316719189017 not found in recent history                             Dump: /data1/aafghani/dev/devamir/logs/dumps/22i125840.dump";

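      // Capture the timestamp after optional leading whitespace. A look-behind such as
      // (?<=^\\s+) has no bounded length, and some Java versions reject it.
      Pattern pattern = Pattern.compile("^\\s*(\\d{2}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2})");
      Matcher matcher = pattern.matcher(test);
      if (matcher.find()) {
         String dateString = matcher.group(1);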

         SimpleDateFormat sdf = new SimpleDateFormat("MM-dd-yy HH:mm:ss");

         try {
            Date date = sdf.parse(dateString);
            System.out.println(date);
         } catch (ParseException e) {
            e.printStackTrace();
         }
      }


   }
}

1 Comment

But first I need to get the date object parsed out.
2

Are you sure that's what you need? I'd consider splitting the string on delimiters or columns and using existing date parsing libs to do the heavy lifting.
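
A rough sketch of the splitting approach, under two assumptions that should be checked against real log data: columns are separated by runs of two or more spaces, and no field contains such a run itself. It uses java.time (Java 8 or later) as the date parsing library, which is my choice for illustration; the class name SplitLogLine and its helper methods are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class SplitLogLine {
    // Timestamp layout taken from the sample line: month-day-year, then 24-hour time.
    static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("MM-dd-yy HH:mm:ss");

    // Assumption: two or more whitespace characters separate columns.
    static String[] columns(String line) {
        return line.trim().split("\\s{2,}");
    }

    // Parses the first column as a timestamp; throws DateTimeParseException if it is malformed.
    static LocalDateTime timestamp(String line) {
        return LocalDateTime.parse(columns(line)[0], STAMP);
    }

    public static void main(String[] args) {
        String line = "09-22-11 12:58:40       SEVERE       ...ractBlobAodCommand:104"
            + "           -   IllegalStateException: version:1316719189017 not found in recent history"
            + "          Dump: /data1/aafghani/dev/devamir/logs/dumps/22i125840.dump";

        for (String column : columns(line)) {
            System.out.println("[" + column + "]");
        }
        System.out.println(timestamp(line));
    }
}
```

With this split, the location and line number stay together in one column ("...ractBlobAodCommand:104"), as do the exception name and message, so a second split on ':' is still needed for those fields.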

1

If you want to extract only the date (without the time):

^\d{2}-\d{2}-\d{2}

In Java, the backslashes must be escaped in the string literal:

String regex = "^\\d{2}-\\d{2}-\\d{2}";
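
A small sketch of how that pattern could be applied with Matcher.find(), assuming the date always sits at the very start of the line. The class name DatePrefix and the method datePrefix are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DatePrefix {
    // Anchored at the start of the input: three two-digit groups separated by hyphens.
    static final Pattern DATE = Pattern.compile("^\\d{2}-\\d{2}-\\d{2}");

    // Returns the leading date text, or null if the line does not start with one.
    static String datePrefix(String line) {
        Matcher m = DATE.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(datePrefix("09-22-11 12:58:40       SEVERE")); // 09-22-11
        System.out.println(datePrefix("no date here"));                   // null
    }
}
```

The pattern only checks the shape of the text, so a value such as 99-99-99 would also match; validating the date itself is better left to a date parser.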

1

You can use for the date:

^\d\d-\d\d-\d\d
