I have the following string, which comes from an Excel column:

  "\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\""

I would like a regex pattern that retrieves the entire string, so that my result is exactly:

"USE CODE ""Gef, sdf"" FROM 1/7/07"

Below is what I tried

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {
    public static void main(String[] args) {
        // Strings to be scanned to find the pattern.
        String line = "\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\", Delete , Hello , How are you ? , ";
        String line2 = "Test asda ds asd, tesat2 . test3";

        String dpattern = "(\"[^\"]*\")(?:,(\"[^\"]*\"))*,|([^,]+),";

        // Create a Pattern object.
        Pattern r = Pattern.compile(dpattern);

        // Now create a Matcher object (swap in line2 to test the second string).
        Matcher m = r.matcher(line);
        if (m.find()) {
            System.out.println("Found value: 0 " + m.group(0));
            // System.out.println("Found value: 1 " + m.group(1));
            // System.out.println("Found value: 2 " + m.group(2));
        } else {
            System.out.println("NO MATCH");
        }
    }
}

The match breaks off after the first comma, so the output is

Found value: 0 "USE CODE ""Gef,

It should be

Found value: 0 "USE CODE ""Gef, sdf"" FROM 1/7/07",

For the second string, using Matcher m = r.matcher(line2);, the output should be

Found value: 0 "Test asda ds asd",
  • Note: In the real scenario, I cannot manipulate the given string (for example, by removing the quotes) before applying the regex. All I can do is extract it with a regex pattern, so this Java code only exists to test the pattern. Commented Jun 15, 2016 at 8:22
  • I'm not sure what you're trying to do with the non-capturing group ?: and the OR operator |, but it looks like you could just use (\"[^\"]*\"){3} and it would match 3 quoted groups. Commented Jun 15, 2016 at 8:22
  • Do you mean you need to get the double-quoted substring only? Like "[^"]*(?:""[^"]*)*"? What is the expected output for line 1 and line 2? Maybe (?:"[^"]*(?:""[^"]*)*"|[^,])+ might do? Commented Jun 15, 2016 at 8:25
  • Why not simply select everything between two \", like \".*\"? You can pre-check the string's validity by making sure the number of \" is even, to avoid false positives. Commented Jun 15, 2016 at 8:25
  • See ideone.com/1AOvhZ - I just modified it to return the first match. Does it work for you? Commented Jun 15, 2016 at 8:32

1 Answer

You may use

(?:"[^"]*(?:""[^"]*)*"|[^,])+

See the regex demo

Explanation:

  • " - leading quote
  • [^"]* - 0+ chars other than a double quote
  • (?:""[^"]*)* - 0+ sequences of "" (an escaped quote) followed by 0+ chars other than a double quote
  • " - trailing quote

OR:

  • [^,] - any char but a comma

And the whole pattern is matched 1 or more times as it is enclosed with (?:...)+ and + matches 1 or more occurrences.
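
To make the breakdown above easier to check against the actual pattern, here is a minimal sketch that writes the same regex in Java's Pattern.COMMENTS mode, where whitespace and #-comments inside the pattern are ignored. The class name CommentedFieldPattern and the helper firstField are hypothetical names used only for illustration; the regex itself is unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommentedFieldPattern {
    // Same regex as above, spread over several lines with inline comments.
    // Pattern.COMMENTS makes the engine skip whitespace and "# ..." comments.
    static final Pattern FIELD = Pattern.compile(
        "(?:                # group repeated 1+ times (see the final )+ )\n" +
        "  \"               # leading quote\n" +
        "  [^\"]*           # 0+ chars other than a double quote\n" +
        "  (?:\"\"[^\"]*)*  # 0+ times: escaped quote, then 0+ non-quote chars\n" +
        "  \"               # trailing quote\n" +
        "|                  # OR\n" +
        "  [^,]             # any single char but a comma\n" +
        ")+",
        Pattern.COMMENTS);

    // Hypothetical helper: returns the first match, or null if there is none.
    static String firstField(String input) {
        Matcher m = FIELD.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\", Delete , Hello , How are you ? , ";
        String line2 = "Test asda ds asd, tesat2 . test3";
        System.out.println(firstField(line));
        System.out.println(firstField(line2));
    }
}
```

The quoted alternative is listed first, so a field that starts with a quote is consumed as a whole, commas included, before the [^,] alternative gets a chance to stop at a comma.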

IDEONE demo:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDemo {
    public static void main(String[] args) {
        String line = "\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\", Delete , Hello , How are you ? , ";
        String line2 = "Test asda ds asd, tesat2 . test3";
        Pattern pattern = Pattern.compile("(?:\"[^\"]*(?:\"\"[^\"]*)*\"|[^,])+");
        Matcher matcher = pattern.matcher(line);
        if (matcher.find()) {                        // if is used to get the 1st match only
            System.out.println(matcher.group(0));    // "USE CODE ""Gef, sdf"" FROM 1/7/07"
        }
        Matcher matcher2 = pattern.matcher(line2);
        if (matcher2.find()) {
            System.out.println(matcher2.group(0));   // Test asda ds asd
        }
    }
}
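
If all fields are needed rather than just the first one, the same pattern can be applied in a while loop instead of an if. The following is only a sketch under a few assumptions that are not in the question: the class name FieldSplitter and the helper fields are hypothetical, each match is trimmed, and blank matches (such as the single space after the last comma in line) are dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FieldSplitter {
    // The pattern from the answer, unchanged.
    private static final Pattern FIELD =
        Pattern.compile("(?:\"[^\"]*(?:\"\"[^\"]*)*\"|[^,])+");

    // Hypothetical helper: collects every match, trimmed, skipping blank ones.
    static List<String> fields(String line) {
        List<String> result = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {                  // while instead of if: visit every match
            String field = m.group().trim();
            if (!field.isEmpty()) {         // assumption: whitespace-only fields are unwanted
                result.add(field);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "\"USE CODE \"\"Gef, sdf\"\" FROM 1/7/07\", Delete , Hello , How are you ? , ";
        for (String field : fields(line)) {
            System.out.println(field);
        }
    }
}
```

Note that the pattern needs at least one character per match, so a completely empty field (two adjacent commas) produces no match at all. If positions matter, that case would need separate handling.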