STOPATDESK YES;
:: TXT "LCLLMT:29.4700";
:: TXT "LCLCURR;NON-USD";
:: TXT "CALLBK:3";
:: TXT "FFTRL:EUR-LIM;-TAP-5";

STOPATDESK YES; :: TXT "LCLLMT:29.4700"; :: TXT "LCLCURR;NON-USD"; :: TXT "CALLBK:3"; :: TXT "FFTRL:EUR-LIM;-TAP-5";

Could you please provide a regex that matches semicolons, but not those inside TXT "..."?

There are several useful questions on Stack Overflow, but I failed to put together a working solution for my case:
Regex for matching a character, but not when it's enclosed in square bracket
Regex for matching a character, but not when it's enclosed in quotes

  • I do not understand. Do you want to match the semicolons at the end of the lines? Commented Sep 9, 2015 at 13:21
  • There were several useful questions on Stack Overflow, but I failed to put together a working solution for my case. Commented Sep 9, 2015 at 13:23
  • Yes, I want to match semicolons, but they may not be at the end of the lines. Commented Sep 9, 2015 at 13:27
  • Just match? Easy: "TXT\\s*\"[^\"]*\"|(;)" and grab .group(1). Commented Sep 9, 2015 at 13:33
  • Try using s.split("(?<!TXT \"[^\"]{0,1000});"). If the TXT "..." values are no longer than 1000 characters, that might work in this case. But I do not think a constrained-width look-behind is that reliable. Commented Sep 9, 2015 at 13:43

2 Answers


You need a regex that matches any semicolon that is not followed by an odd number of quotes.

;(?![^"]*(?:"[^"]*"[^"]*)*"[^"]*$)

The tricky part is the negative lookahead (?![^"]*(?:"[^"]*"[^"]*)*"[^"]*$):

  • [^"]* matches any text between the ; and the first " after it
  • (?:"[^"]*"[^"]*)* matches any number of quote pairs, i.e. an even number of quotes, together with the text around them
  • "[^"]*$ matches one last quote, followed by quote-free text up to the end of the input

If all of the above match, an odd number of " follows the ;. That implies that the ; sits between an opening and a closing ", and therefore it is not a valid ;. This relies on the quotes in the input being balanced.

Regex: https://regex101.com/r/dG6cC1/1

Java demo: https://ideone.com/OuAaA5
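
As a minimal sketch of how such a lookahead could be used from Java: the class name SemicolonSplit, the method splitStatements, and the sample string are illustrative assumptions, and the pattern is spelled so that each repetition of the group consumes exactly one pair of quotes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, not taken from the linked demo.
class SemicolonSplit {
    // Matches a ';' unless an odd number of '"' follows it up to the end of the input.
    static final Pattern OUTSIDE_QUOTES =
            Pattern.compile(";(?![^\"]*(?:\"[^\"]*\"[^\"]*)*\"[^\"]*$)");

    // Splits the input on every ';' that is not inside a quoted string.
    static List<String> splitStatements(String input) {
        return Arrays.asList(OUTSIDE_QUOTES.split(input));
    }

    public static void main(String[] args) {
        String line = "STOPATDESK YES; :: TXT \"LCLCURR;NON-USD\"; :: TXT \"CALLBK:3\";";
        for (String part : splitStatements(line)) {
            System.out.println("[" + part.trim() + "]");
        }
    }
}
```

Note that Pattern.split drops trailing empty strings, so the final ; does not produce an empty last element. The lookahead rescans the rest of the input for every semicolon, so this is better suited to short inputs than to very large files.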


You can also try with:

"[^"]*"|(;)


This matches either a whole quoted string or a standalone semicolon; the standalone semicolons are then available in group(1). However, unbalanced quotation marks would cause a problem. Alternatively, if the whole file is formatted like your example (a semicolon inside quotes is always preceded and followed by another character, not whitespace), you can try:

;(?=\s|$)


It works with the example above.
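
The quoted-string-or-semicolon alternation from this answer could be applied in Java roughly as follows. This is only a sketch: the class name SemicolonFinder, the method semicolonOffsets, and the sample string are made up for illustration, and it assumes balanced quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used here only to illustrate the group(1) idea.
class SemicolonFinder {
    // First alternative consumes a whole quoted string; second captures a bare ';'.
    static final Pattern QUOTED_OR_SEMICOLON = Pattern.compile("\"[^\"]*\"|(;)");

    // Returns the offsets of all semicolons that are not inside a quoted string.
    static List<Integer> semicolonOffsets(String input) {
        List<Integer> offsets = new ArrayList<>();
        Matcher m = QUOTED_OR_SEMICOLON.matcher(input);
        while (m.find()) {
            // group(1) is null whenever the match was a quoted string.
            if (m.group(1) != null) {
                offsets.add(m.start(1));
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        String line = "STOPATDESK YES; :: TXT \"LCLCURR;NON-USD\";";
        System.out.println(semicolonOffsets(line));
    }
}
```

Because each quoted string is consumed as a single match, the scan passes over the input once, and any semicolon inside quotes never reaches the capturing group.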
