
The following is sample data on which the regex should be applied:

2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Filter -> Map (1/1) (824780055001546646d35df7a64cfe3c) switched from CANCELING to CANCELED.
2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Try to restart or fail the job  (3064130e1dccead0b037f193d3699c3b) if no longer possible.
2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.
java.lang.IllegalArgumentException: json can not be null or empty
    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)
    at com.jayway.jsonpath.JsonPath.compile(JsonPath.java:424)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.validateJsonPath(ControlData.java:194)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:178)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:171)
    at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Could not restart the job  (3064130e1dccead0b037f193d3699c3b) because the restart strategy prevented it.
java.lang.IllegalArgumentException: json can not be null or empty
    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)
    at com.jayway.jsonpath.JsonPath.compile(JsonPath.java:424)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.validateJsonPath(ControlData.java:194)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:178)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:171)
    at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 3064130e1dccead0b037f193d3699c3b.
2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore  - Shutting down
2019-05-27 10:49:18,419 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher      - Job 3064130e1dccead0b037f193d3699c3b reached globally terminal state FAILED.

Basically, what I want to extract is the timestamp and the ERROR entry with its message.

For instance:

TimeStamp               Error
2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.
java.lang.IllegalArgumentException: json can not be null or empty
    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)
    at com.jayway.jsonpath.JsonPath.compile(JsonPath.java:424)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.validateJsonPath(ControlData.java:194)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:178)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:171)
    at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)

Here the error message is split across multiple lines, so for that I have written a Java pattern like the one below:

((?m)\\d{4}-[01]\\d-[0-3]\\d\\s[0-2]\\d((:[0-5]\\d)?){2}[\\s\\S]*ERROR[\\s\\S]*[ ]*at [\\s\\S]*)

But it returns all the content of the file.

What should I do to make it work so that it gives me the multi-line error message too?

  • I see from your stack trace that you appear to be trying to apply regex to a JSON string. Don't do that. Use a JSON parser instead. Commented May 31, 2019 at 4:41
  • Didn't get you, this is basically any stack trace file from which I want to extract multiline error message. Commented May 31, 2019 at 4:46
  • I get you now...you're trying to parse the stack trace itself :-) Commented May 31, 2019 at 4:46
  • Exactly :) you got it. Commented May 31, 2019 at 4:47

2 Answers

1

Try this:

((\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5})\sERROR.+?(?=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5}))

Explanation:

  • (\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5}) - matches the timestamp
  • \sERROR.+?(?=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5}) - does a non-greedy match until the next timestamp is found (positive lookahead)
  • Also, I would like to highlight that the . has to be able to match line breaks for this regex to span multiple lines. In Java that is the DOTALL option (Pattern.DOTALL or (?s)), not the m (MULTILINE) flag
  • Each match will give you nested groups, like [[log, timestamp],[log, timestamp]]
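To make the grouping concrete, here is a minimal Java sketch of the same idea, under a few assumptions of mine that are not in the pattern above: the class name ErrorLogExtractor and the method extractErrors are made up for illustration, the message gets its own capture group, the lookahead also accepts the end of the input (so the last entry in a file is not lost), and MULTILINE is added so that a timestamp only counts at the start of a line. Because both ERROR entries in the sample share the same timestamp, the map stores a list of messages per timestamp rather than a single value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ErrorLogExtractor {
    // Timestamp such as 2019-05-27 10:49:18,418
    private static final String TS =
            "\\d{4}-\\d{2}-\\d{2}\\s\\d{2}:\\d{2}:\\d{2},\\d{3,5}";

    // Group 1: timestamp. Group 2: everything after ERROR up to the next line
    // that starts with a timestamp, or up to the end of the input.
    // DOTALL lets '.' cross line breaks; MULTILINE lets '^' match at line starts.
    private static final Pattern ERROR_ENTRY = Pattern.compile(
            "^(" + TS + ")\\sERROR\\s+(.+?)(?=^" + TS + "|\\z)",
            Pattern.DOTALL | Pattern.MULTILINE);

    // Hypothetical helper: maps each timestamp to the ERROR messages logged at it.
    static Map<String, List<String>> extractErrors(String log) {
        Map<String, List<String>> errors = new LinkedHashMap<>();
        Matcher m = ERROR_ENTRY.matcher(log);
        while (m.find()) {
            errors.computeIfAbsent(m.group(1), k -> new ArrayList<>())
                  .add(m.group(2).trim());
        }
        return errors;
    }

    public static void main(String[] args) {
        // Abridged lines from the sample log in the question.
        String log = String.join("\n",
                "2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph - Try to restart or fail the job.",
                "2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph - Job switched from state FAILING to FAILED.",
                "java.lang.IllegalArgumentException: json can not be null or empty",
                "    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)",
                "    at java.lang.Thread.run(Thread.java:748)",
                "2019-05-27 10:49:18,419 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Job reached globally terminal state FAILED.");

        extractErrors(log).forEach((timestamp, messages) ->
                messages.forEach(msg -> System.out.println(timestamp + " => " + msg)));
    }
}
```

If you really need exactly one value per timestamp, you could join the list or key the map on something more unique, but that choice depends on your data.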

4 Comments

This worked but how can I add a timestamp as the key and error message as a value in hashmap?
That you will have to do programmatically. Every match will have two groups: the first is the error, the second is the timestamp. You can loop over the list and create a hashmap while looping.
I tried something like this: map.put(m.group(1), m.group(2)); but key and value both contains an error message.
What could be the pattern for this type of input 2019-05-27 05:14:36.224Z|app|machine|[XNIO-2 task-11]|ERROR|533a1030-0301-407d-bdbb-569892bded45|||b3b73ff6-3689-4def-a860-9dccd77226d9||9gHJMKox|3|com.app.log.LogAspect:aroundInternal|61|Error occured com.pkg.subpkg.class.getIt(String, Set)
0

Your pattern looks off, and you should also be using it in dot-all mode, since the portion of the stack trace that you want to capture may span more than one line. I suggest using the following regex pattern:

\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ERROR.*?(?=\bat\b)

This matches a timestamp, followed by ERROR and then all content until reaching the first at.

Here is a working test script (it needs java.util.regex.Pattern and java.util.regex.Matcher to be imported):

String input = "2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.\njava.lang.IllegalArgumentException: json can not be null or empty\n    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)";
String pattern = "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3} ERROR.*?(?=\\bat\\b)";
Pattern r = Pattern.compile(pattern, Pattern.DOTALL);
Matcher m = r.matcher(input);
if (m.find()) {
    System.out.println(m.group(0));
}

Output:

2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.
java.lang.IllegalArgumentException: json can not be null or empty
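For a complete log file where INFO entries sit between the ERROR entries, one alternative to a single DOTALL regex is to walk the file line by line and treat every line that starts with a timestamp as the beginning of a new entry. The sketch below is only an illustration of that approach, not a replacement for the pattern above; the class name LogEntryGrouper and the method errorEntries are hypothetical, and it assumes every entry starts with the timestamp format shown in the question followed by the level.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogEntryGrouper {
    // A new log entry starts with a timestamp followed by the level (INFO, ERROR, ...).
    private static final Pattern ENTRY_START = Pattern.compile(
            "^\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}\\s+(\\w+)\\s");

    // Returns each ERROR entry together with the continuation lines
    // (exception message and stack frames) that follow it.
    static List<String> errorEntries(List<String> lines) {
        List<String> result = new ArrayList<>();
        StringBuilder current = null; // non-null only while inside an ERROR entry
        for (String line : lines) {
            Matcher m = ENTRY_START.matcher(line);
            if (m.find()) {
                // A new entry begins: close the previous ERROR entry, if any.
                if (current != null) {
                    result.add(current.toString());
                }
                current = "ERROR".equals(m.group(1)) ? new StringBuilder(line) : null;
            } else if (current != null) {
                // Continuation line that belongs to the current ERROR entry.
                current.append('\n').append(line);
            }
        }
        if (current != null) {
            result.add(current.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // Abridged lines from the sample log in the question.
        List<String> lines = Arrays.asList(
                "2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph - Try to restart or fail the job.",
                "2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph - Job switched from state FAILING to FAILED.",
                "java.lang.IllegalArgumentException: json can not be null or empty",
                "    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)",
                "2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore - Shutting down");
        errorEntries(lines).forEach(System.out::println);
    }
}
```

In a real program the lines would typically come from something like Files.readAllLines; an ERROR entry without a stack trace then simply yields a one-line result instead of swallowing the INFO lines after it.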

6 Comments

It's working only for the input that you hardcoded; if you consider my file, then it will print all content, including INFO logs too.
Then make the dot lazy, use .*?(?=\bat\b)
Tried same but now it gives single line result including INFO logs too.
@snoop I can't reproduce your observations, and my answer is working in this online Java demo.
Slightly modified the input and included INFO logs in between ERROR logs, please check; now it's printing INFO logs too. rextester.com/GAEL82642. Updated.
