Static code parser for Java source code to extract methods / comments

Question

I'm looking for a parser that can extract the methods from a Java class (static source code, i.e. a .java file), along with each method's signature, comments / documentation, and variables. Preferably the parser itself should be written in Java.

Could someone please advise?

Thanks.

Extract to what format? You might just run the Javadoc tool and deal with the generated HTML.
– Jim Garrison, Commented Jun 16, 2012 at 3:55
To me, APIs would be more useful if they are available: getMethodNames, getComments, getDocumentation, etc. Also, Javadoc wouldn't give me the private variables declared inside methods, the names of methods called from the method, etc.
– Manish Mulani, Commented Jun 16, 2012 at 4:01
I'm still researching this. A combination of ASTParser and Doclet could solve it, I guess. Any help would be appreciated.
– Manish Mulani, Commented Jun 16, 2012 at 4:50

Suraj Chandran · Accepted Answer · 2012-06-16 05:12:18Z

8

You can use Eclipse's ASTParser. It's super simple to use.

Find a quick standalone example here.

answered Jun 16, 2012 at 5:12

Suraj Chandran


4 Comments

Manish Mulani Over a year ago

Yeah, I was checking that out. Is there a way to get "comments" inside methods using ASTParser?

Suraj Chandran Over a year ago

ASTParser does declare a node called "Comment". You can just read its javadoc

Manish Mulani Over a year ago

CompilationUnit gives a commentList, but I'm not able to get the actual comment text. For example, if the comment is // hello world, it outputs just //

Manish Mulani Over a year ago

Overall - programcreek.com/2011/01/… was more useful.
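
On the "outputs just //" problem from the comments above: as far as I know, JDT's line and block comment nodes record only a source range (`getStartPosition()` and `getLength()`), not the text, so the text has to be cut out of the original source string. The sketch below shows only that slicing idea in plain Java, with no JDT dependency. `CommentSlices`, `Span`, `findComments` and `textOf` are hypothetical names made up for illustration, and the scanner is a naive stand-in for the real parser.

```java
import java.util.ArrayList;
import java.util.List;

final class CommentSlices {
    /** Hypothetical stand-in for a comment node: just a start offset and a length. */
    static final class Span {
        final int start;
        final int length;
        Span(int start, int length) { this.start = start; this.length = length; }
    }

    /** Recovers the comment text by slicing the original source with the span. */
    static String textOf(String source, Span span) {
        return source.substring(span.start, span.start + span.length);
    }

    /** Naive scan for line and block comments; string and char literals are skipped. */
    static List<Span> findComments(String src) {
        List<Span> spans = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '"' || c == '\'') {
                i = skipLiteral(src, i);
            } else if (src.startsWith("//", i)) {
                int end = src.indexOf('\n', i);
                if (end < 0) end = src.length();
                spans.add(new Span(i, end - i));
                i = end;
            } else if (src.startsWith("/*", i)) {
                int close = src.indexOf("*/", i + 2);
                int end = close < 0 ? src.length() : close + 2;
                spans.add(new Span(i, end - i));
                i = end;
            } else {
                i++;
            }
        }
        return spans;
    }

    /** Returns the index just past the literal that opens at 'start'. */
    static int skipLiteral(String src, int start) {
        char quote = src.charAt(start);
        int i = start + 1;
        while (i < src.length() && src.charAt(i) != quote) {
            if (src.charAt(i) == '\\') i++; // step over the escaped character
            i++;
        }
        return Math.min(i + 1, src.length());
    }

    public static void main(String[] args) {
        String src = "void f() {\n    // hello world\n    /* block */\n}\n";
        for (Span s : findComments(src)) {
            System.out.println(textOf(src, s));
        }
    }
}
```

With JDT the two numbers would come from each comment node rather than from `findComments`; the `substring` call is the part that carries over.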

Assaf Gamliel · Accepted Answer · 2013-02-17 11:29:32Z

5

Here is what I do to extract the method signatures from one or more Java files:

I open the file I want to get the signatures from in Sublime Text 2, then do a Find (Ctrl+F) in regular-expression mode with the following regex that I made (I tested it on my code and it works; I hope it will work for you too):

((synchronized +)?(public|private|protected) +(static [a-Z\[\]]+|[a-Z\[\]]+) [a-Z]+\([a-Z ,\[\]]*\)\n?[a-Z ,\t\n]*\{)

After Sublime Text 2 highlights the results, I click "Find All", copy (Ctrl+C), open a new tab (Ctrl+N) and paste (Ctrl+V).
You will then see all of your method signatures.

I hope it helped.
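
If you would rather run the same search from Java than from an editor, a rough sketch follows. One adaptation is needed: `java.util.regex` rejects the `a-Z` range, so it is written as `a-zA-Z` here. `SignatureRegex` and `findSignatures` are hypothetical names. The regex keeps the limitations of the one above: it only sees methods with an explicit `public`, `private` or `protected` modifier, and it misses generics, annotations, and identifiers that contain digits or underscores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SignatureRegex {
    // Same shape as the regex above, with a-Z rewritten as a-zA-Z for java.util.regex.
    static final Pattern SIGNATURE = Pattern.compile(
        "((synchronized +)?(public|private|protected) +"
        + "(static [a-zA-Z\\[\\]]+|[a-zA-Z\\[\\]]+) [a-zA-Z]+"
        + "\\([a-zA-Z ,\\[\\]]*\\)\\n?[a-zA-Z ,\\t\\n]*\\{)");

    /** Returns every match, each ending at the opening brace of the method body. */
    static List<String> findSignatures(String source) {
        List<String> found = new ArrayList<>();
        Matcher m = SIGNATURE.matcher(source);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String src = "class Demo {\n"
            + "    public static void main(String[] args) {\n"
            + "    }\n"
            + "    private int add(int a, int b) throws Exception {\n"
            + "        return a + b;\n"
            + "    }\n"
            + "}\n";
        for (String signature : findSignatures(src)) {
            System.out.println(signature);
        }
    }
}
```

For anything beyond a quick listing, a real parser is the safer route, since a regex cannot tell code from text inside comments or string literals.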

edited Feb 17, 2013 at 11:29

answered Dec 26, 2012 at 9:54

Assaf Gamliel


Ira Baxter · Accepted Answer · 2012-06-19 01:01:17Z

If all you want is the exact text of each method, and the exact text of the variables inside methods, you could get by with a parser that produces a CST, walking the CST to find the right nodes, and then prettyprinting the found subtrees. ANTLR has a Java parser that would work for this. I don't know if it will capture comments. I think the main distribution of ANTLR is coded in Java.

You can likely do this more hackily, in Java, with a lexer for Java, implementing what amounts to a bad island parser that looks for the key phrases. ("After 'class', find '{' and print out everything you find up to the matching '}'" would give you all the methods and fields).
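
A minimal sketch of that "find 'class', then print up to the matching '}'" idea, in plain Java. `ClassBodyIsland` and `firstClassBody` are hypothetical names, and this is deliberately the bad island parser described above: it skips comments and string/char literals while counting braces, but it can still be fooled, for example by the word `class` appearing in a comment or in a `Foo.class` literal before the declaration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ClassBodyIsland {
    private static final Pattern CLASS_KEYWORD = Pattern.compile("\\bclass\\b");

    /** Returns the text between the braces of the first class found, or null. */
    static String firstClassBody(String src) {
        Matcher m = CLASS_KEYWORD.matcher(src);
        if (!m.find()) return null;
        int open = src.indexOf('{', m.end());
        if (open < 0) return null;
        int depth = 0;
        int i = open;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '"' || c == '\'') {
                i = skipLiteral(src, i);          // braces inside literals do not count
            } else if (src.startsWith("//", i)) {
                int nl = src.indexOf('\n', i);    // line comment: jump to end of line
                i = nl < 0 ? src.length() : nl;
            } else if (src.startsWith("/*", i)) {
                int end = src.indexOf("*/", i + 2); // block comment: jump past its end
                i = end < 0 ? src.length() : end + 2;
            } else {
                if (c == '{') {
                    depth++;
                } else if (c == '}' && --depth == 0) {
                    return src.substring(open + 1, i); // matching close brace found
                }
                i++;
            }
        }
        return null; // braces never balanced
    }

    /** Returns the index just past the literal that opens at 'start'. */
    static int skipLiteral(String src, int start) {
        char quote = src.charAt(start);
        int i = start + 1;
        while (i < src.length() && src.charAt(i) != quote) {
            if (src.charAt(i) == '\\') i++; // step over the escaped character
            i++;
        }
        return Math.min(i + 1, src.length());
    }

    public static void main(String[] args) {
        String src = "public class A {\n  int x = 1; // }\n  void f() { x--; }\n}\n";
        System.out.println(firstClassBody(src));
    }
}
```

The returned body still contains the comments verbatim, which is one reason this route can be enough when all you want is the exact text.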

If you want more precise detail (e.g., you want to know the actual type of an argument rather than just its name, or where the type is actually defined) you'll need a parser with a full front end and name resolution. (ANTLR won't do this.) The Eclipse JDT certainly builds trees; it likely does name resolution. Our DMS Software Reengineering Toolkit with its Java Front End can provide everything necessary for this task, including comment capture and extraction. DMS isn't coded in Java.

You objected to Javadoc as being inadequate, because it doesn't give you the content of methods. Perhaps our Java Source Browser, which does give you that code, would serve better. It integrates name resolution data from our DMS/Java Front End to hyperlink JavaDoc-type information into browsable source text; all fields as well as local variables are explicitly indexed. The Source Browser isn't coded in Java, but then presumably you simply want to run it and scrape your result. Such scraping might be harder than it appears staring at the screen; there's a lot of HTML behind such a display.
