30

I'm using the Reflection API in PHP to pull a DocComment (PHPDoc) string from a method:

$r = new ReflectionMethod($object, 'methodName'); // ReflectionMethod needs the method name as well
$comment = $r->getDocComment();

This will return a string that looks something like this (depending on how well the method was documented):

/**
* Does this great things
*
* @param string $thing
* @return Some_Great_Thing
*/

Are there any built-in methods or functions that can parse a PHP Doc Comment String into a data structure?

$object = some_magic_function_or_method($comment_string);

echo 'Returns a: ', $object->return;

Lacking that, what part of the PHPDoc source code should I be looking at to do this myself?

Lacking and/or in addition to that, is there third-party code that's considered "better" at this than the PHPDoc code?

I realize parsing these strings isn't rocket science, or even computer science, but I'd prefer a well tested library/routine/method that's been built to deal with a lot of the janky, semi-non-correct PHP Doc code that might exist in the wild.

1
  • If you're trying to read in the @ tags and their values, then using preg_match would be the best solution. Commented Jul 15, 2012 at 15:36
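The regular-expression approach suggested in the comment above can be sketched roughly as follows. This is only a minimal illustration in plain PHP, not a replacement for a tested library: the function name sketch_parse_doc_tags is hypothetical, and the pattern assumes one tag per line in a multi-line comment, so it ignores single-line comments such as "/** @return int */", multi-line tag descriptions, and inline tags.

```php
<?php

/**
 * Hypothetical helper: collect "@tag value" lines from a doc comment.
 * Returns an array keyed by tag name; each entry lists the raw values found.
 */
function sketch_parse_doc_tags(string $comment): array
{
    $tags = [];
    // Per line: optional indentation, optional "*", then "@name" and the rest of the line.
    $pattern = '/^[ \t]*\*?[ \t]*@(\w+)[ \t]*(.*)$/m';
    if (preg_match_all($pattern, $comment, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // rtrim() drops trailing spaces and a stray "\r" from Windows line endings.
            $tags[$m[1]][] = rtrim($m[2]);
        }
    }
    return $tags;
}

// The doc comment from the question, as a nowdoc so "$thing" is not interpolated.
$comment = <<<'DOC'
/**
 * Does this great things
 *
 * @param string $thing
 * @return Some_Great_Thing
 */
DOC;

$tags = sketch_parse_doc_tags($comment);
echo 'Returns a: ', $tags['return'][0], "\n";
```

With the sample comment from the question this prints "Returns a: Some_Great_Thing". Values are kept as raw strings, so splitting "string $thing" into a type and a variable name is left to the caller.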

12 Answers 12

22

I am surprised this wasn't mentioned yet: what about using Zend_Reflection from Zend Framework? This may come in handy, especially if you work with software built on Zend Framework, such as Magento.

See the Zend Framework Manual for some code examples and the API Documentation for the available methods.

There are different ways to do this:

  • Pass a file name to Zend_Reflection_File.
  • Pass an object to Zend_Reflection_Class.
  • Pass an object and a method name to Zend_Reflection_Method.
  • If you really only have the comment string at hand, you could even throw together the code for a small dummy class, save it to a temporary file and pass that file to Zend_Reflection_File.

Let's go for the simple case and assume you have an existing class you want to inspect.

The code would be like this (untested, please forgive me):

$method = new Zend_Reflection_Method($class, 'yourMethod');
$docBlock = $method->getDocBlock(); // variable names are case-sensitive; must match $docBlock below

if ($docBlock->hasTag('return')) {
    $tagReturn = $docBlock->getTag('return'); // $tagReturn is an instance of Zend_Reflection_Docblock_Tag_Return
    echo "Returns a: " . $tagReturn->getType() . "<br>";
    echo "Comment for return type: " . $tagReturn->getDescription();
}

2 Comments

Does this work with traits? As in if a class is using traits will it pick up the correct docblock?
To be honest: I'm not sure. As far as I am aware, Zend_Reflection for ZF1 was written when PHP 5.4 and traits had not been released yet.
15
+400

You can use the "DocBlockParser" class from Fabien Potencier's open-source Sami project ("Yet Another PHP API Documentation Generator").
First of all, get Sami from GitHub.
This is an example of how to use it:

<?php

require_once 'Sami/Parser/DocBlockParser.php';
require_once 'Sami/Parser/Node/DocBlockNode.php';

class TestClass {
    /**
     * This is the short description.
     *  
     * This is the 1st line of the long description 
     * This is the 2nd line of the long description 
     * This is the 3rd line of the long description   
     *  
     * @param bool|string $foo sometimes a boolean, sometimes a string (or, could have just used "mixed")
     * @param bool|int $bar sometimes a boolean, sometimes an int (again, could have just used "mixed") 
     * @return string de-html_entitied string (no entities at all)
     */
    public function another_test($foo, $bar) {
        return strtr($foo,array_flip(get_html_translation_table(HTML_ENTITIES)));
    }
}

use Sami\Parser\DocBlockParser;
use Sami\Parser\Node\DocBlockNode;

try {
    $method = new ReflectionMethod('TestClass', 'another_test');
    $comment = $method->getDocComment();
    if ($comment !== FALSE) {
        $dbp = new DocBlockParser();
        $doc = $dbp->parse($comment);
        echo "\n** getDesc:\n";
        print_r($doc->getDesc());
        echo "\n** getTags:\n";
        print_r($doc->getTags());
        echo "\n** getTag('param'):\n";
        print_r($doc->getTag('param'));
        echo "\n** getErrors:\n";
        print_r($doc->getErrors());
        echo "\n** getOtherTags:\n";
        print_r($doc->getOtherTags());
        echo "\n** getShortDesc:\n";
        print_r($doc->getShortDesc());
        echo "\n** getLongDesc:\n";
        print_r($doc->getLongDesc());
    }
} catch (Exception $e) {
    print_r($e);
}

?>

And here is the output of the test page:

** getDesc:
This is the short description.

This is the 1st line of the long description 
This is the 2nd line of the long description 
This is the 3rd line of the long description
** getTags:
Array
(
    [param] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => Array
                                (
                                    [0] => bool
                                    [1] => 
                                )

                            [1] => Array
                                (
                                    [0] => string
                                    [1] => 
                                )

                        )

                    [1] => foo
                    [2] => sometimes a boolean, sometimes a string (or, could have just used "mixed")
                )

            [1] => Array
                (
                    [0] => Array
                        (
                            [0] => Array
                                (
                                    [0] => bool
                                    [1] => 
                                )

                            [1] => Array
                                (
                                    [0] => int
                                    [1] => 
                                )

                        )

                    [1] => bar
                    [2] => sometimes a boolean, sometimes an int (again, could have just used "mixed")
                )

        )

    [return] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => Array
                                (
                                    [0] => string
                                    [1] => 
                                )

                        )

                    [1] => de-html_entitied string (no entities at all)
                )

        )

)

** getTag('param'):
Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => bool
                            [1] => 
                        )

                    [1] => Array
                        (
                            [0] => string
                            [1] => 
                        )

                )

            [1] => foo
            [2] => sometimes a boolean, sometimes a string (or, could have just used "mixed")
        )

    [1] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => bool
                            [1] => 
                        )

                    [1] => Array
                        (
                            [0] => int
                            [1] => 
                        )

                )

            [1] => bar
            [2] => sometimes a boolean, sometimes an int (again, could have just used "mixed")
        )

)

** getErrors:
Array
(
)

** getOtherTags:
Array
(
)

** getShortDesc:
This is the short description.
** getLongDesc:
This is the 1st line of the long description 
This is the 2nd line of the long description 
This is the 3rd line of the long description

2 Comments

This approach no longer works. It seems that in 2014 they decided to drop their own parser in favor of the DocBlox/phpDocumentor parser.
It works with the old version of Sami: github.com/FriendsOfPHP/Sami/tree/v1.2
8

2022 phpdoc-parser

PHPStan now has its own AST-based parser for doc blocks:

https://github.com/phpstan/phpdoc-parser

  • it's maintained for the long term
  • it allows node traversal
  • it can resolve fully qualified names (FQNs)
  • it has a format-preserving printer

Here is how you can modify it with a custom node visitor.

Comments

6

You could use DocBlox (http://github.com/mvriel/docblox) to generate an XML data structure for you. You can install DocBlox using PEAR and then run the command:

docblox parse -d [FOLDER] -t [TARGET_LOCATION]

This will generate a file called structure.xml, which contains all metadata about your source code, including parsed docblocks.

OR

You can use the DocBlox_Reflection_DocBlock* classes to directly parse a piece of DocBlock text.

You can do this by making sure you have autoloading enabled (or by including all DocBlox_Reflection_DocBlock* files) and then executing the following:

$parsed = new DocBlox_Reflection_DocBlock($docblock);

Afterwards you can use the getters to extract the information that you want.

Note: you do not need to remove the asterisks; the Reflection class takes care of this.
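For readers who do not have DocBlox installed, the asterisk removal mentioned in the note can be approximated in plain PHP. The following is only a rough sketch under stated assumptions, not DocBlox's implementation: the function name sketch_split_description is hypothetical, and it assumes the description ends at the first line that starts with "@" and that the first blank line separates the short description from the long one.

```php
<?php

/**
 * Hypothetical helper: strip the comment frame, then split the text that
 * precedes the first @tag into a short and a long description.
 */
function sketch_split_description(string $comment): array
{
    // Drop the opening "/**" and the closing "*/".
    $body = preg_replace('~^\s*/\*\*|\*/\s*$~', '', $comment);
    // Drop the leading "*" (and one following space or tab) from every line.
    $body = preg_replace('~^[ \t]*\*[ \t]?~m', '', $body);
    // Keep only the text before the first line that starts with "@".
    $parts = preg_split('~^[ \t]*@~m', $body, 2);
    $text = trim($parts[0]);
    // Assumption: the first blank line separates the short and long description.
    $pieces = preg_split('~\R[ \t]*\R~', $text, 2);

    return [
        'short' => trim($pieces[0]),
        'long'  => isset($pieces[1]) ? trim($pieces[1]) : '',
    ];
}

$docblock = <<<'DOC'
/**
 * This is the short description.
 *
 * This is the 1st line of the long description
 * This is the 2nd line of the long description
 *
 * @return string
 */
DOC;

$desc = sketch_split_description($docblock);
echo $desc['short'], "\n---\n", $desc['long'], "\n";
```

The sketch does not look at the tags at all; a description line that itself begins with "@" would be cut off early, which is one of the edge cases a dedicated parser is expected to handle better.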

4 Comments

DocBlox is dead. Seems to have been merged into this github.com/phpDocumentor/phpDocumentor2
Indeed, phpDocumentor and DocBlox have merged together to form phpDocumentor2
Just noticed you're the author. Sneaky guy.
@mvriel, DocBlox_Reflection_DocBlock is deprecated, as you point out. How can we use the phpDocumentor2 API for the same purpose?
5

Check out

http://pecl.php.net/package/docblock

The docblock_tokenize() function will get you part-way there, I think.

Comments

4

You can always view the source from phpDoc. The code is under LGPL so if you do decide to copy it you would need to license your software under the same license AND properly add the correct notices.

EDIT: Unless, as @Samuel Herzog noted, you use it as a library.

Thanks @Samuel Herzog for the clarification.

2 Comments

As long as you're using the phpDoc part only as a library, it's perfectly fine to use your own license model. The license information was neither required nor correct.
Apologies if it wasn't clear from my question, but I know I can use the PHPDoc source code. I'm hoping someone here can provide the exact code I can use to do this to save me glopping through an unfamiliar source tree.
4

I suggest addendum. It's pretty cool, still alive, and used in many PHP 5 frameworks.

http://code.google.com/p/addendum/

Check the tests for examples

http://code.google.com/p/addendum/source/browse/trunk#trunk%2Fannotations%2Ftests

Comments

4

phpdoc-parser (https://github.com/phpstan/phpdoc-parser) is probably the most modern and flexible way to parse PHPDoc (as Tomas Votruba says). Unfortunately, there is not much documentation, so here is a simple way to start:

<?php

use PHPStan\PhpDocParser\Lexer\Lexer;
use PHPStan\PhpDocParser\Parser\ConstExprParser;
use PHPStan\PhpDocParser\Parser\PhpDocParser;
use PHPStan\PhpDocParser\Parser\TokenIterator;
use PHPStan\PhpDocParser\Parser\TypeParser;

require 'vendor/autoload.php';

$comment = <<<'PHP'
/**
 * @param int $foo
 * @return string|false
 */
PHP;

$phpDocLexer = new Lexer();
$constantExpressionParser = new ConstExprParser();
$phpDocParser = new PhpDocParser(new TypeParser($constantExpressionParser), $constantExpressionParser);

$tokens = new TokenIterator($phpDocLexer->tokenize($comment));

$ast = $phpDocParser->parse($tokens);

print_r($ast);

// For example to get the type of the first param
// $ast->getTagsByName('@param')[0]->value->type->name; // int

Comments

1

I suggest you take a look at http://code.google.com/p/php-annotations/

The code is fairly simple to be modified/understood if needed.

Comments

1

As pointed out in one of the answers above, you can use phpDocumentor. If you use Composer, just add "phpdocumentor/reflection-docblock": "~2.0" to the "require" block of your composer.json.

See this for an example: https://github.com/abdulla16/decoupled-app/blob/master/composer.json

For usage examples, see: https://github.com/abdulla16/decoupled-app/blob/master/Container/Container.php

Comments

0

This is an updated version of user1419445's code. The DocBlockParser::parse() method has changed and now needs a second parameter, the parser context. It also seems to be slightly coupled with phpDocumentor, so for the sake of simplicity I assume you have Sami installed via Composer. The code below works for Sami v4.0.16:

<?php

require_once 'vendor/autoload.php';

class TestClass {
    /**
     * This is the short description.
     *  
     * This is the 1st line of the long description 
     * This is the 2nd line of the long description 
     * This is the 3rd line of the long description   
     *  
     * @param bool|string $foo sometimes a boolean, sometimes a string (or, could have just used "mixed")
     * @param bool|int $bar sometimes a boolean, sometimes an int (again, could have just used "mixed") 
     * @return string de-html_entitied string (no entities at all)
     */
    public function another_test($foo, $bar) {
        return strtr($foo,array_flip(get_html_translation_table(HTML_ENTITIES)));
    }
}

use Sami\Parser\DocBlockParser;
use Sami\Parser\Filter\PublicFilter;
use Sami\Parser\ParserContext;

try {
    $method = new ReflectionMethod('TestClass', 'another_test');
    $comment = $method->getDocComment();
    if ($comment !== FALSE) {
        $dbp = new DocBlockParser();
        $filter = new PublicFilter;
        $context = new ParserContext($filter, $dbp, NULL);
        $doc = $dbp->parse($comment, $context);
        echo "\n** getDesc:\n";
        print_r($doc->getDesc());
        echo "\n** getTags:\n";
        print_r($doc->getTags());
        echo "\n** getTag('param'):\n";
        print_r($doc->getTag('param'));
        echo "\n** getErrors:\n";
        print_r($doc->getErrors());
        echo "\n** getOtherTags:\n";
        print_r($doc->getOtherTags());
        echo "\n** getShortDesc:\n";
        print_r($doc->getShortDesc());
        echo "\n** getLongDesc:\n";
        print_r($doc->getLongDesc());
    }
} catch (Exception $e) {
    print_r($e);
}

?>

Comments

0

Have a look at the Php Comment Manager package. It parses the DocBlock comments of methods, using the PHP Reflection API to fetch them.

Comments
