
I have strings like the ones below. I want to split them on spaces, but ignore any space that falls inside quotes.

LA    L'TL0BPC,C'ABC  '   THIS IS COMMENT
LA    C'TL0PC',C'ABC  '   THIS IS COMMENT
MVC   EBW000(4),=C'MPI '  THIS IS ANOTHER' CASE

I want to split those lines like this:

LA L'TL0BPC,C'ABC ' THIS IS COMMENT

LA C'TL0PC',C'ABC ' THIS IS COMMENT

How can I achieve this using a Java regex? Any other solution is also acceptable.

I have tried this:

String ODD_QT_REGEX="[ ]+(?=([^'']*'[^'']*')*[^'']*)"; 
String EVEN_QT_REGEX="[ ]+(?=([^'']*'[^'']*')*[^'']*$)"; 

but this doesn't do what I need.

  • Have you tried anything yet? Commented Mar 6, 2015 at 14:26
  • String ODD_QT_REGEX="[ ]+(?=([^'']*'[^'']*')*[^'']*)"; String EVEN_QT_REGEX="[ ]+(?=([^'']*'[^'']*')*[^'']*$)"; I tried these two, but they are failing. Commented Mar 6, 2015 at 14:28
  • Why did you post it as a comment? Your attempts should be part of your question, so use the edit option to place them there. Commented Mar 6, 2015 at 14:33

1 Answer

You could do matching instead of splitting. Splitting on the regex [ ]+(?=([^'']*'[^'']*')*[^'']*) is possible only if your input has balanced quotes.
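As a rough illustration of the matching approach, here is a minimal sketch that collects tokens with a Matcher instead of calling split. It assumes the quotes in the line are balanced, so it fits the second sample line but not the first. The class name QuoteAwareTokens and the token pattern are illustrative choices, not part of the original question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteAwareTokens {
    // A token is a run of quoted blocks and/or characters that are
    // neither whitespace nor a quote. Assumes balanced quotes.
    private static final Pattern TOKEN = Pattern.compile("(?:'[^']*'|[^\\s'])+");

    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("LA    C'TL0PC',C'ABC  '   THIS IS COMMENT"));
        // prints [LA, C'TL0PC',C'ABC  ', THIS, IS, COMMENT]
    }
}
```

With an unbalanced quote such as L'TL0BPC, this pattern pairs the quotes up wrongly, which is why the split-based regex below adds a special case for apostrophes.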

It seems I have figured out the problem. This regex is like the OP's, but it does not treat an apostrophe as a single quote. The regex below matches one or more whitespace characters that are followed by:

  • \b'\b An apostrophe (a single quote with a word character on each side).
  • | OR
  • '[^']*' A single-quoted block.
  • | OR
  • [^'] Any character other than a single quote.
  • (?:\b'\b|'[^']*'|[^'])* The group above, zero or more times. It must then be followed by the end-of-line anchor $.
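The pieces listed above can also be written out as a commented pattern, which may make the structure easier to follow. This is only a sketch: it uses Pattern.COMMENTS so that whitespace and # comments inside the regex are ignored, and the class name CommentedSplit is a made-up name for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class CommentedSplit {
    // Same regex as in the answer, spread over several lines.
    private static final Pattern SPLITTER = Pattern.compile(
        "\\s+            # one or more whitespace characters (the split point)\n" +
        "(?=             # only if the rest of the line consists of:\n" +
        "  (?:\n" +
        "      \\b'\\b   # an apostrophe between two word characters\n" +
        "    | '[^']*'   # or a complete single-quoted block\n" +
        "    | [^']      # or any character other than a single quote\n" +
        "  )*            # repeated zero or more times\n" +
        "  $             # up to the end of the line\n" +
        ")",
        Pattern.COMMENTS);

    static String[] split(String line) {
        return SPLITTER.split(line);
    }

    public static void main(String[] args) {
        String line = "LA    L'TL0BPC,C'ABC  '  THIS IS COMMENT";
        System.out.println(Arrays.toString(split(line)));
        // prints [LA, L'TL0BPC,C'ABC  ', THIS, IS, COMMENT]
    }
}
```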

Code:

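import java.util.Arrays;

String r = "LA    L'TL0BPC,C'ABC  '  THIS IS COMMENT";
// Split on whitespace only when the rest of the line has no unmatched quote.
String[] m = r.split("\\s+(?=(?:\\b'\\b|'[^']*'|[^'])*$)");
System.out.println(Arrays.toString(m));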

OR

For a more exact match, you could replace \b in the above regex with lookarounds, so that only a quote between two letters counts as an apostrophe.

"\\s+(?=(?:(?<=[a-zA-Z])'(?=[A-Za-z])|'[^']*'|[^'])*$)"

Output:

[LA, L'TL0BPC,C'ABC  ', THIS, IS, COMMENT]
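To see where the two variants differ, here is a small sketch that runs both on a made-up input in which the quote sits between two digits. The input string and the class name ApostropheVariants are hypothetical and chosen only to show the difference: \b treats digits as word characters, while the lookaround variant accepts letters only.

```java
import java.util.Arrays;

public class ApostropheVariants {
    // Variant 1: a quote between any two word characters is an apostrophe.
    static final String WORD_BOUNDARY =
        "\\s+(?=(?:\\b'\\b|'[^']*'|[^'])*$)";

    // Variant 2: only a quote between two letters is an apostrophe.
    static final String LETTERS_ONLY =
        "\\s+(?=(?:(?<=[a-zA-Z])'(?=[A-Za-z])|'[^']*'|[^'])*$)";

    public static void main(String[] args) {
        String line = "X  9'9  Y"; // hypothetical input

        // 9'9 counts as an apostrophe, so every run of spaces is a split point.
        System.out.println(Arrays.toString(line.split(WORD_BOUNDARY)));
        // prints [X, 9'9, Y]

        // 9'9 now contains an unmatched quote, so no split happens before it.
        System.out.println(Arrays.toString(line.split(LETTERS_ONLY)));
        // prints [X  9'9, Y]
    }
}
```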

6 Comments

The unbalanced quote threw me off!
Doing some testing. If it passes, I will accept your answer. +1 for the explanation.
EMPAS870 EQU * >>> TOP OF 'RESTACK THE INELIGIBLES LOOP. Is any correction possible for this line? It's not splitting on the space before the quote.
No, because there is only one single quote present and it's not an apostrophe.
MVC EBW000(4),=C'MPI ' TELL EMQ4 TO PROCESS COMP NRFN'S. This one is also failing. Please help.