
I'm trying to pull code comment blocks out of JavaScript files. I'm building a lightweight code documentation tool.

An example would be:

/** @Method: setSize
 * @Description: setSize DESCRIPTION
 * @param: setSize PARAMETER
 */

I need to pull out comments set up like this, ideally into an array.

I had gotten as far as this, but I realize it may not handle newlines, tabs, etc.:

\/\*\*(.*?)\*\/

(Okay, this seems like it would be simple, but I'm going in circles trying to get it to work.)

I am not sure that regexp is the best tool to use for this one as you're dealing with multiple lines and parsing logic depends on whether it's the first/last/middle line... Commented Feb 13, 2012 at 15:09

3 Answers


Depending on what you want to do with the extracted docblocks, multiple approaches come to mind. If you simply need the docblocks themselves, String.match() may suffice. If you also need the position of each block in the source, use RegExp.exec() in a loop, which exposes the index of every match.

As others have already pointed out, JavaScript's RegExp engine is anything but powerful. If you're used to PCRE, it feels like working with your hands tied behind your back. In particular, the dot (.) does not match line breaks, so the character class [\s\S] (whitespace or non-whitespace, i.e. any character) serves as a stand-in for dotAll and also captures line breaks.
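To illustrate the point about [\s\S], here is a minimal sketch. The sample string and variable names are made up for the example: the dot stops at line breaks, while [\s\S] does not.

```javascript
// Hypothetical sample: a docblock that spans several lines.
var sample = '/** first line\n * second line\n */';

// "." does not match line terminators, so this pattern finds nothing.
var withDot = sample.match(/\/\*\*(.*?)\*\//);

// [\s\S] matches any character, line breaks included.
var withClass = sample.match(/\/\*\*([\s\S]*?)\*\//);

console.log(withDot);            // null
console.log(withClass !== null); // true
```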

This should get you started:

var string = 'var foo = "bar";'
    + '\n\n'
    + '/** @Method: setSize'
    + '\n * @Description: setSize DESCRIPTION'
    + '\n * @param: setSize PARAMETER'
    + '\n */'
    + '\n'
    + 'function setSize(setSize) { return true; }'
    + '\n\n'
    + '/** @Method: foo'
    + '\n * @Description: foo DESCRIPTION'
    + '\n * @param: bar PARAMETER'
    + '\n */'
    + '\n'
    + 'function foo(bar) { return true; }';

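// matches "/**", then as few characters as possible (line breaks included), then "*/"
var docblock = /\/\*{2}([\s\S]+?)\*\//g,
    // strips leading and trailing whitespace
    trim = function(string){
        return string.replace(/^\s+|\s+$/g, '');
    },
    // splits a block body at each line break that is followed by " * "
    split = function(string) {
        return string.split(/[\r\n]\s*\*\s+/);
    };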

// extract all doc-blocks
console.log(string.match(docblock));

// extract all doc-blocks with access to character-index
var match;
while ((match = docblock.exec(string)) !== null) {
    console.log(
        match.index + " characters from the beginning, found: ",
        trim(match[1]),
        // trim first, so the first and last entries carry no stray whitespace
        split(trim(match[1]))
    );
}
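
Since the question asks for an array, the loop above could be wrapped along the lines of the following sketch. The function name extractDocblocks and the shape of the returned objects (index, lines) are assumptions made for illustration; they are not part of the original answer.

```javascript
// Hypothetical helper: collects every /** ... */ block into an array.
function extractDocblocks(source) {
    // A fresh regex per call, so lastIndex state is never shared.
    var docblock = /\/\*{2}([\s\S]+?)\*\//g,
        blocks = [],
        match;

    while ((match = docblock.exec(source)) !== null) {
        var lines = match[1]
            .split(/\r?\n/)
            // drop the leading " * " decoration and trailing whitespace
            .map(function (line) {
                return line.replace(/^\s*\*?\s*/, '').replace(/\s+$/, '');
            })
            // skip lines that are empty after cleanup
            .filter(function (line) { return line.length > 0; });

        blocks.push({ index: match.index, lines: lines });
    }
    return blocks;
}

var example = 'var a = 1;\n'
    + '/** @Method: setSize\n'
    + ' * @Description: setSize DESCRIPTION\n'
    + ' */\n'
    + 'function setSize() {}';

console.log(extractDocblocks(example));
```

Each entry keeps the character index of the block, so it can still be tied back to the code that follows it.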

1 Comment

Thanks, gents, for your help! Just awesome. That's why I love this board.

This should grab a comment block \/\*\*[^/]+\/. I don't think Regexp is the best way to generate an array from these blocks though. This regexp basically says:

Find a /** (the asterisk and forward slashes are escaped with \)

then find anything that isn't a /

then find one /

It's crude, but it should generally work as long as the comment body itself contains no /. Here's a live example: http://regexr.com?300c6
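
Because [^/]+ cannot cross a slash, the match ends at the first / inside the comment body, for instance in a URL. The short sketch below shows this with an invented sample string and compares it with the lazy [\s\S]*? pattern suggested in the comments under this answer.

```javascript
// Invented sample: the comment body contains slashes.
var source = '/** See http://example.com/docs\n * @param: size\n */';

var crude = /\/\*\*[^/]+\//;      // stops at the first "/"
var lazy  = /\/\*\*[\s\S]*?\*\//; // stops at the first "*/"

console.log(source.match(crude)[0]);           // "/** See http:/"
console.log(source.match(lazy)[0] === source); // true
```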

3 Comments

A better way to find the end is to use the non-greedy pattern .*?\*\/. The first part (.*?) matches anything, but gets the shortest pattern that matches. Then \*\/ matches the end of the comment.
@mcrumley That is a little cleaner, although you need the dotall flag enabled otherwise the .*? doesn't match return characters. I don't think javascript supports the dotall flag.
@mcrumley This question confirms that the dotall flag isn't supported in javascript but suggests a workaround with [\s\S]*? stackoverflow.com/questions/1068280/…

What about some magic :)

comment.replace(/@(\w+)\s*\:\s*(\S+)\s+(\w+)/gim, function (match, tag, name, descr) {
    console.log(arguments);
    // Do something with tag, name and descr here ...
});

I haven't tested this, so there is no guarantee the regex is correct. It is only meant to point you to one possibility: doing the RegExp search the John Resig way 8-)
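
As a rough sketch of that idea, the callback form of replace() can be used purely to walk over the matches and collect them. The pattern below is an adapted variant that keeps each tag on its own line and makes the description optional; parseTags and the tag/name/descr field names are hypothetical.

```javascript
// Hypothetical helper: turns one docblock into an array of tag objects.
function parseTags(comment) {
    var tags = [];
    // [ \t] instead of \s keeps a match from spilling onto the next line;
    // the trailing description part is optional.
    comment.replace(/@(\w+)[ \t]*:[ \t]*(\S+)(?:[ \t]+(.*))?/g,
        function (match, tag, name, descr) {
            tags.push({ tag: tag, name: name, descr: descr || '' });
            return match; // the string returned by replace() is ignored
        });
    return tags;
}

var comment = '/** @Method: setSize\n'
    + ' * @Description: setSize DESCRIPTION\n'
    + ' * @param: setSize PARAMETER\n'
    + ' */';

console.log(parseTags(comment));
```

Using [ \t] rather than \s is a deliberate choice here: with \s, a tag that has no description (such as @Method above) would try to take its description from the following line.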
