I have a file with the following content:

~LayerData
type="waypointlist"
type="waypointlistend"
type="track" name="Track1" color=#695cbb
type="trackpoint" latitude="43.5032064" longitude="16.4266248"
type="trackpoint" latitude="43.5071074767561" longitude="16.48329290000057"
type="trackend"
~EndLayerData
~LayerData
type="waypointlist"
type="waypointlistend"
type="track" name="Track2" color=#000000
type="trackpoint" latitude="43.51037193515589" longitude="16.491883500895977"
type="trackpoint" latitude="43.521582832754135" longitude="16.473187288140295"
type="trackend"
~EndLayerData

I'm extracting the LayerData -> EndLayerData blocks using:

Pattern p = Pattern.compile("(~LayerData(.|\n)*~EndLayerData)");
Matcher m = p.matcher(s);

As a result, I get three groups from the matcher: the first two are identical and contain the complete file, and the last one is "\n". I expected to get Track1 and Track2 as separate matches.

4 Comments
  • What exactly are you trying to extract here? Commented Jul 28, 2020 at 6:31
  • Remember to escape properly. A single \ in a string becomes \\. \n is a line break; if you want to match it, you need to use \\n. Commented Jul 28, 2020 at 7:05
  • @TimBiegeleisen: I'm trying to extract two LayerData sections Commented Jul 28, 2020 at 7:31
  • @Polygnome: This was copied from the IDE; the source code does contain \\. Commented Jul 28, 2020 at 7:31

3 Answers

1

You could match ~LayerData, followed by all lines that do not start with either ~LayerData or ~EndLayerData, using a negative lookahead.

^~LayerData(?:\R(?!~(?:End)?LayerData).*)*\R~EndLayerData

Explanation

  • ^~LayerData Match ~LayerData at the start of a line (^ matches at line starts because Pattern.MULTILINE is set)
  • (?: Non-capturing group
    • \R(?!~(?:End)?LayerData) Match a line break and assert that what is directly to the right is neither ~EndLayerData nor ~LayerData
    • .* Match the rest of the line
  • )* Close the group and repeat it 0+ times to consume all such lines
  • \R~EndLayerData Match a line break followed by ~EndLayerData

In Java with double escaped backslashes:

String regex = "^~LayerData(?:\\R(?!~(?:End)?LayerData).*)*\\R~EndLayerData";


Example code

String regex = "^~LayerData(?:\\R(?!~(?:End)?LayerData).*)*\\R~EndLayerData";
String string = "...";

Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(string);

while (matcher.find()) {
    System.out.println(matcher.group(0));
}
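
To make the example above easier to try, here is a minimal self-contained sketch. The class name LayerBlockExtractor, the method extractBlocks and the shortened sample input are assumptions made for illustration; only the regex and the MULTILINE flag come from the answer. It needs Java 8 or later because of \R.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LayerBlockExtractor {
    // Regex from the answer; MULTILINE makes ^ match at the start of every line.
    private static final Pattern BLOCK = Pattern.compile(
            "^~LayerData(?:\\R(?!~(?:End)?LayerData).*)*\\R~EndLayerData",
            Pattern.MULTILINE);

    // Hypothetical helper: collects every ~LayerData ... ~EndLayerData block.
    public static List<String> extractBlocks(String input) {
        List<String> blocks = new ArrayList<>();
        Matcher matcher = BLOCK.matcher(input);
        while (matcher.find()) {          // one iteration per block
            blocks.add(matcher.group());  // group() is the whole match
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Shortened version of the file from the question (illustrative only).
        String input = String.join("\n",
                "~LayerData",
                "type=\"track\" name=\"Track1\" color=#695cbb",
                "type=\"trackend\"",
                "~EndLayerData",
                "~LayerData",
                "type=\"track\" name=\"Track2\" color=#000000",
                "type=\"trackend\"",
                "~EndLayerData");

        List<String> blocks = extractBlocks(input);
        System.out.println(blocks.size() + " blocks found");
        for (String block : blocks) {
            System.out.println("-----");
            System.out.println(block);
        }
    }
}
```

Each call to find() advances to the next block, so Track1 and Track2 end up as separate list entries rather than as capture groups of a single match.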

0

Try this pattern

(~LayerData(.|\n)*?~EndLayerData)
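
As a possible variation on the lazy pattern above, the (.|\n) alternation can be replaced by a plain dot combined with Pattern.DOTALL, which lets the dot match line breaks as well. This is only a sketch: the class name DotAllBlocks and the method extract are made up for illustration. The (.|\n)* form is also commonly reported to risk a StackOverflowError on long inputs in Java, which the DOTALL form is generally expected to avoid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotAllBlocks {
    // DOTALL lets '.' match line terminators; '*?' stops at the first ~EndLayerData.
    private static final Pattern BLOCK =
            Pattern.compile("~LayerData.*?~EndLayerData", Pattern.DOTALL);

    // Hypothetical helper: returns each block as its own string.
    public static List<String> extract(String input) {
        List<String> blocks = new ArrayList<>();
        Matcher matcher = BLOCK.matcher(input);
        while (matcher.find()) {
            blocks.add(matcher.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        String input = "~LayerData\nTrack1\n~EndLayerData\n~LayerData\nTrack2\n~EndLayerData";
        for (String block : extract(input)) {
            System.out.println(block);
            System.out.println("-----");
        }
    }
}
```

The separate blocks only appear when find() is called in a loop; reading the capture groups of a single match will not yield them.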

1 Comment

Thanks, but I get the same result. The problem is that the ~EndLayerData / ~LayerData boundary in the middle of the file is not detected, not the LayerData content itself.

0

Update: Use Code Generator under tools in regex101 to get language-specific regex.

String regex = "\\~LayerData(.|\\n)*?\\~EndLayerData";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println(matcher.group(0));
}

Earlier answer: You are not getting the expected matches because the regex is greedy. It matches everything from the first "~LayerData" to the last "~EndLayerData", so the whole file is matched as a single block. Building and testing the regex on regex101.com (which helps to visualize the match) should fix the issue.
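
The difference between the greedy and the lazy quantifier can be checked with a small sketch. The class GreedyVsLazy, the method countMatches and the shortened input are illustrative assumptions. The last part also shows where the "three items" in the question most likely come from: they are group 0, group 1 and group 2 of a single match, not three separate matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {
    // Hypothetical helper: counts how many separate matches find() yields.
    public static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "~LayerData\nTrack1\n~EndLayerData\n~LayerData\nTrack2\n~EndLayerData";

        // Greedy: runs from the first ~LayerData to the LAST ~EndLayerData.
        System.out.println(countMatches("~LayerData(.|\n)*~EndLayerData", input));  // 1
        // Lazy: stops at the FIRST ~EndLayerData, so each block is its own match.
        System.out.println(countMatches("~LayerData(.|\n)*?~EndLayerData", input)); // 2

        // The pattern from the question has two capturing groups.
        Matcher m = Pattern.compile("(~LayerData(.|\n)*~EndLayerData)").matcher(input);
        if (m.find()) {
            System.out.println(m.groupCount());               // 2
            System.out.println(m.group(0).equals(m.group(1))); // true: both are the whole match
            System.out.println("\n".equals(m.group(2)));       // true: last character the inner group consumed
        }
    }
}
```

Iterating with find() walks over successive matches, while group(n) only looks inside the current match.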

3 Comments

This is the regex: regex101.com/r/JG3W09/1. The problem here seems to be the /g flag, which is enabled by default in Java, yet the result differs from the one on regex101.com.
It does not work; please take a look at this online Java test: tpcg.io/cxW9uGRz
@ernest Works fine. Please take a look: tpcg.io/W3B7f1kd
