I am trying to extract all substrings of a string that lie between /* and */. I know this will probably need regular expressions, but I'm having a hard time getting the regex right, since the star character is itself a regex quantifier (it denotes repetition). I'm trying to use the preg_match function in PHP. Here is what I have come up with so far, but I'm not having much luck.

<?php
   $aString = "abcdef/*ghij*/klmn/*opqrs*/tuvwxyz";
   preg_match("/*/.*/", $aString, $anArray);

   for ($i = 0; $i < count($anArray); i++)
      echo $anArray[i] . "\n";
?>
  • You don't happen to be parsing comment blocks out of PHP source code with this? Commented Aug 14, 2010 at 11:50
  • I'm trying to build a php code formatter to display php code using HTML Commented Aug 14, 2010 at 11:52
  • Did you try your code before posting it here? There are basic mistakes like: i -> $i which php parser should report. Commented Aug 14, 2010 at 11:55
  • @jazzdawg: In that case use token_get_all. Commented Aug 14, 2010 at 11:56

6 Answers

1

To extract comment sections out of PHP code, use the Tokenizer.

token_get_all() will parse the code, and return an array of elements.

Comments will be represented as T_COMMENT elements (docblocks that start with /** come back as T_DOC_COMMENT instead).

This has the great advantage of catching all possible ways of having comments in PHP code:

/* This way, */

// This way

# and this way
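
As a rough sketch of the tokenizer approach described above (the sample source string and the variable names are made up for illustration), collecting the comment tokens might look like this:

```php
<?php
// Hypothetical sample source; any string of PHP code would do.
$source = <<<'PHP'
<?php
/* block comment */
$str = "/* not a comment */"; // line comment
# hash comment
PHP;

$comments = [];
foreach (token_get_all($source) as $token) {
    // Each token is either a single-character string or an [id, text, line] array.
    if (is_array($token) && in_array($token[0], [T_COMMENT, T_DOC_COMMENT], true)) {
        // Older PHP versions keep the trailing newline on one-line comments.
        $comments[] = rtrim($token[1]);
    }
}

print_r($comments);
```

The quoted "/* not a comment */" is reported as a string token, so it is not collected.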

2 Comments

Thanks I'll have a look into that
More important: It won’t give you false positives like in $str = "/* foo */";
0

Working code:

 $aString = "abcdef/*ghij*/klmn/*opqrs*/tuvwxyz";

 // SIMPLE VERSION: THE COMMENT BODY MUST NOT CONTAIN AN ASTERISK
 // \/\* matches the literal /* (with # as the delimiter, only the * really needs escaping)
 // [^\*]* matches any run of characters other than * (asterisk)
 // \*\/ matches the literal */
 preg_match_all("#\/\*[^\*]*\*\/#", $aString, $anArray);

 // BETTER VERSION 
 // http://www.regular-expressions.info/refadv.html - for explanation of ?: and ?!  
 preg_match_all("#\/\*" . "((?:(?!\*\/).)*)" . "\*\/#", $aString, $anArray);


 var_dump($anArray); // easier for debugging than for-loop

Output for better version:

array(2) {
  [0]=>
  array(2) {
    [0]=>
    string(8) "/*ghij*/"
    [1]=>
    string(9) "/*opqrs*/"
  }
  [1]=>
  array(2) {
    [0]=>
    string(4) "ghij"
    [1]=>
    string(5) "opqrs"
  }
}
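
To see how the two versions differ, here is a small sketch run against a hypothetical input whose first comment contains an asterisk. The patterns are the same as above, minus the unneeded escapes:

```php
<?php
// Hypothetical input: the first comment has a star inside it.
$input = "abc/*gh*ij*/klmn/*opqrs*/tuv";

// Simple version: [^*]* stops at the inner asterisk, so the first comment is missed.
preg_match_all('#/\*[^*]*\*/#', $input, $simple);

// Lookahead version: consume any character that does not begin "*/".
preg_match_all('#/\*((?:(?!\*/).)*)\*/#', $input, $better);

print_r($simple[0]); // only "/*opqrs*/"
print_r($better[1]); // "gh*ij" and "opqrs"
```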

4 Comments

If I put a star somewhere in there (e.g. /*gh*ij*), it will fail.
@NullUserException: Yes, I was aware of that, and I've added a new version which should work better.
Why are you escaping the forward slash with \/?
@NullUserException: I simply don't remember which characters need escaping. :-[
0

Escape the * to match it literally, and add parentheses to capture the content, like this: /\*(.*)\*/. You should also use preg_match_all to find all matches in your string.

(And var_dump($anArray) is easier than a for loop.)
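
One thing to watch with (.*) is that it is greedy. A minimal sketch, assuming # as the delimiter so the slashes need no escaping, compares it with the lazy (.*?) on the string from the question:

```php
<?php
$aString = "abcdef/*ghij*/klmn/*opqrs*/tuvwxyz";

// Greedy: .* runs to the LAST "*/", so both comments are swallowed as one match.
preg_match_all('#/\*(.*)\*/#', $aString, $greedy);

// Lazy: .*? stops at the first "*/" it can, giving one match per comment.
preg_match_all('#/\*(.*?)\*/#', $aString, $lazy);

var_dump($greedy[1]); // one element: "ghij*/klmn/*opqrs"
var_dump($lazy[1]);   // two elements: "ghij" and "opqrs"
```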


0
$aString = "abcdef/*ghij*/klmn/*opqrs*/tuvwxyz";
preg_match_all("/\/\*(.*?)\*\//", $aString, $anArray, PREG_SET_ORDER);
var_dump($anArray);


0

If (as you say in one of the comments) you're attempting to display PHP code in HTML there's actually a built-in function (highlight_file) that does precisely this.
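
For example, the related highlight_string() works on a string rather than a file path. This is only a sketch with a made-up snippet, and the exact HTML it produces differs between PHP versions:

```php
<?php
// Hypothetical snippet to render; highlight_file() takes a file path instead.
$code = "<?php\n/* greet */\necho 'hi';\n";

// Passing true as the second argument returns the HTML instead of printing it.
$html = highlight_string($code, true);

echo $html;
```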

Feel free to ignore if you're using this as a learning exercise, etc. :-)


0

I think the regex can be as simple as

\/\*.*?\*\/

Here's a demo of working code in a regular expression tester:

http://liveregex.com/WoDbk
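
One caveat: by default the dot does not match a newline, so a comment that spans several lines would be skipped. A small sketch with a hypothetical two-line comment, wrapping the pattern above in / delimiters and adding the s modifier:

```php
<?php
// Hypothetical input: the first comment spans two lines.
$input = "a/* one\n   two */b/* three */c";

// Without the s modifier, "." stops at the newline and the first comment is missed.
preg_match_all('/\/\*.*?\*\//', $input, $noFlag);

// With s (PCRE_DOTALL), "." also matches newlines.
preg_match_all('/\/\*.*?\*\//s', $input, $withFlag);

var_dump($noFlag[0]);   // only "/* three */"
var_dump($withFlag[0]); // both comments
```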

