I'm trying to extract the docblocks that precede certain function calls in a PHP file. Unlike the usual approach, I'm not parsing the docblocks of function definitions.

Example file:

<?php
$data = get_data($id);

if ( empty( $data->random ) ) {
  /**
  * Brief description
  *
  * @since 1.0
  * @param int $var Variable
  */
  do_function( 'identifier', $var );
  exit;
}

// random comment
$status = get_random_function($post);
?>

do_function appears in various places across the files I'm going to parse. What I want to extract and parse is the preceding docblock together with the function call.

A Reflection class is not an option as the files don't include classes, so I'm stuck with the following RegExp which returns an empty array:

preg_match_all('/(\/\*.+\*\/)[\s]{0,}do_function/m', $filecontent_as_string, $results);

What am I doing wrong here? Thanks!

Use something like an annotation parser to parse the annotations? Commented Sep 9, 2013 at 19:10

2 Answers

Check out Tokenizer or Reflection for this case. You could also use file(), which returns the file as an array of lines, and match the relevant comment lines against that array.
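As a rough illustration of the Tokenizer route, here is a minimal sketch built on token_get_all(). The helper name find_docblocks_before_call() is made up for this example, and it assumes that only whitespace sits between the docblock and the call.

```php
<?php
// Sketch: collect docblocks that directly precede a call to a given function.
// find_docblocks_before_call() is a hypothetical helper, not a PHP built-in.
function find_docblocks_before_call(string $source, string $function): array
{
    $found = [];
    $lastDocblock = null;

    foreach (token_get_all($source) as $token) {
        // Single-character tokens such as ';' or '{' are returned as plain strings.
        if (!is_array($token)) {
            $lastDocblock = null;
            continue;
        }

        [$id, $text] = $token;

        if ($id === T_DOC_COMMENT) {
            // Remember the most recent /** ... */ comment.
            $lastDocblock = $text;
        } elseif ($id === T_WHITESPACE) {
            // Whitespace between the docblock and the call is allowed.
            continue;
        } else {
            if ($id === T_STRING && $text === $function && $lastDocblock !== null) {
                $found[] = $lastDocblock;
            }
            // Any other token ends the "directly preceding" relationship.
            $lastDocblock = null;
        }
    }

    return $found;
}
```

Calling find_docblocks_before_call(file_get_contents($path), 'do_function') would return the raw docblock strings. A docblock in front of a function definition is skipped, because the function keyword comes between the comment and the name.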

If you desire a regular expression in this case, this should do what you want.

/(\/\*(?:[^*]|\n|(?:\*(?:[^\/]|\n)))*\*\/)\s+do_function/

Regular expression:

(                     group and capture to \1:
 \/                   match '/'
 \*                   match '*'
 (?:                  group, but do not capture (0 or more times)
   [^*]   |           any character except: '*' OR
   \n     |           any character of: '\n' (newline) OR
   (?:                group, but do not capture:
     \*               match '*'
     (?:              group, but do not capture:
       [^\/] |        any character except: '/' OR
       \n             any character of: '\n' (newline)
     )                end of grouping
   )                  end of grouping
 )*                   end of grouping
 \*                   match '*'
 \/                   match '/'
)                     end of \1
\s+                   whitespace (\n, \r, \t, \f, and " ") (1 or more times)
do_function           'do_function'
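
To show how the pattern could be applied, here is a small sketch using preg_match_all(). The $source string is a shortened stand-in for the file contents from the question, not a real file.

```php
<?php
// $source is an assumed sample standing in for the contents of a parsed file.
$source = <<<'CODE'
<?php
if ( empty( $data->random ) ) {
  /**
  * Brief description
  *
  * @since 1.0
  */
  do_function( 'identifier', $var );
}
// random comment
$status = get_random_function($post);
CODE;

// The pattern from above; group 1 captures the docblock itself.
$pattern = '/(\/\*(?:[^*]|\n|(?:\*(?:[^\/]|\n)))*\*\/)\s+do_function/';

preg_match_all($pattern, $source, $matches);

// $matches[1] holds one captured docblock per matched call.
foreach ($matches[1] as $docblock) {
    echo $docblock, PHP_EOL;
}
```

$matches[0] contains the docblock plus the whitespace and the do_function name, which may be more convenient if the call position matters.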

You can have a much simpler regex with the following:

#(?s)(/\*(?:(?!\*/).)+\*/)\s+do_function#

(?s) makes the . match newlines. It can also be set as a trailing flag instead: #(/\*(?:(?!\*/).)+\*/)\s+do_function#s

/\* matches the beginning of the docblock.

(?:(?!\*/).)+ matches one character at a time, as long as that character does not start the sequence */.

\*/ matches the end of the docblock.

\s+do_function matches one or more whitespace characters (spaces and newlines) followed by do_function.
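
A quick comparison, as a sketch on an assumed sample string, shows why the newline handling matters: the pattern from the question has no s flag, so its . stops at the first line break and a multi-line docblock never matches.

```php
<?php
// Assumed sample input: a multi-line docblock directly before the call.
$source = "<?php\n/**\n * Brief description\n */\ndo_function('identifier', \$var);\n";

// Pattern from the question: without the s flag, . does not match newlines.
$original = '/(\/\*.+\*\/)[\s]{0,}do_function/m';

// Pattern from this answer, with s given as a trailing flag.
$fixed = '#(/\*(?:(?!\*/).)+\*/)\s+do_function#s';

$originalCount = preg_match_all($original, $source, $originalMatches);
$fixedCount    = preg_match_all($fixed, $source, $fixedMatches);

var_dump($originalCount); // int(0)
var_dump($fixedCount);    // int(1)

// Group 1 is the docblock without the trailing whitespace and function name.
echo $fixedMatches[1][0], PHP_EOL;
```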

2 Comments

I almost used a negative look ahead. ;)
@hwnd Yea, it's a little trick I learned while browsing SO =P Yours works just fine otherwise ^^
