8

I am experimenting with a tool that does static analysis. The tool works on bytecode rather than source code. (However, I have the source code as well).

The tool outputs positions in the bytecode, and now I need an easy way to map them back to the source code. NetBeans and Eclipse do this all the time: when I click on a method in an included library, the IDE takes me to the source if it is available. So I know this is possible; I just could not figure out how to do it.

As an example, consider the following hello world program:

package mypackage;
import java.io.*;
class MyMainClass {
    public static void main(String[] args) {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String name0 = "Alice";
        String name1 = "Bob";
        try {
            name0 = in.readLine();
        }
        catch(Exception e) {
            System.out.println("Caught an exception!");
        }
        System.out.println("Hello " + name0 + "!");
        System.out.println("Hello " + name1 + "!");
    }
}

The generated bytecode (obtained from javap) is:

Compiled from "MyMainClass.java"
class mypackage.MyMainClass extends java.lang.Object{
mypackage.MyMainClass();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   new     #2; //class java/io/BufferedReader
   3:   dup
   4:   new     #3; //class java/io/InputStreamReader
   7:   dup
   8:   getstatic       #4; //Field java/lang/System.in:Ljava/io/InputStream;
   11:  invokespecial   #5; //Method java/io/InputStreamReader."<init>":(Ljava/io/InputStream;)V
   14:  invokespecial   #6; //Method java/io/BufferedReader."<init>":(Ljava/io/Reader;)V
   17:  astore_1
   18:  ldc     #7; //String Alice
   20:  astore_2
   21:  ldc     #8; //String Bob
   23:  astore_3
   24:  aload_1
   25:  invokevirtual   #9; //Method java/io/BufferedReader.readLine:()Ljava/lang/String;
   28:  astore_2
   29:  goto    42
   32:  astore  4
   34:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   37:  ldc     #12; //String Caught an exception!
   39:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   42:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   45:  new     #14; //class java/lang/StringBuilder
   48:  dup
   49:  invokespecial   #15; //Method java/lang/StringBuilder."<init>":()V
   52:  ldc     #16; //String Hello
   54:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   57:  aload_2
   58:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   61:  ldc     #18; //String !
   63:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   66:  invokevirtual   #19; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   69:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   72:  getstatic       #11; //Field java/lang/System.out:Ljava/io/PrintStream;
   75:  new     #14; //class java/lang/StringBuilder
   78:  dup
   79:  invokespecial   #15; //Method java/lang/StringBuilder."<init>":()V
   82:  ldc     #16; //String Hello
   84:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   87:  aload_3
   88:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   91:  ldc     #18; //String !
   93:  invokevirtual   #17; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   96:  invokevirtual   #19; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   99:  invokevirtual   #13; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   102: return
  Exception table:
   from   to  target type
    24    29    32   Class java/lang/Exception


}

The output of the tool is something like:

<mypackage.MyMainClass> 39, 69, 99

These are offsets into the bytecode of main (the numbers in the left column of the javap output), not source line numbers. Manually, I can work out that they correspond to the following lines in the source code:

System.out.println("Caught an exception!"); 
System.out.println("Hello " + name0 + "!"); 
System.out.println("Hello " + name1 + "!"); 

However, I need to automate this process. Any help would be appreciated.

1
  • You can use data decompiler for that Commented May 21, 2012 at 7:19

4 Answers

3

If you have access to both the source file and the appropriate line number, the task should be as simple as loading the file on a per-line basis and merely choosing the one corresponding to that line number.
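As a minimal sketch of that lookup, assuming the source file is available on disk and the line number is 1-based, something like the following would do. The class name SourceLineReader and the method lineAt are hypothetical names introduced only for this illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: fetches one line of a source file by its 1-based number.
final class SourceLineReader {

    /** Returns the requested line, or null if the number is out of range. */
    static String lineAt(Path sourceFile, int lineNumber) throws IOException {
        // Assumes the source is UTF-8 and small enough to load in one go.
        List<String> lines = Files.readAllLines(sourceFile, StandardCharsets.UTF_8);
        if (lineNumber < 1 || lineNumber > lines.size()) {
            return null;
        }
        return lines.get(lineNumber - 1); // list is 0-based, line numbers are 1-based
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical invocation: java SourceLineReader MyMainClass.java 12
        Path file = Paths.get(args[0]);
        int n = Integer.parseInt(args[1]);
        System.out.println(n + ": " + lineAt(file, n));
    }
}
```

This only covers the last step. It presumes you already have a source line number, which is not what the tool in the question reports.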

The catch with this approach is that it relies on optional metadata that the class file format stores as attributes. The two in question, SourceFile and LineNumberTable, are both optional, so there is no guarantee that they are present in your class files. Verify that the class files you are analyzing were compiled with this information! (javac emits both by default; compiling with -g:none leaves them out.)

Note: These same attributes are used for providing the information for stack traces via Throwable.getStackTrace, in case you were wondering.
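To illustrate how a LineNumberTable can turn a bytecode offset into a source line, here is a rough sketch. Each entry in the table pairs a starting offset (start_pc) with a source line, and an instruction belongs to the entry with the largest start_pc that does not exceed its offset. The class OffsetToLine and its methods are hypothetical, and the entries in main are assumed values chosen to match the question's example rather than real compiler output. In practice you would read them from the class file, for instance from the output of javap -l or with a bytecode library.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of one method's LineNumberTable.
final class OffsetToLine {

    // start_pc -> source line, kept sorted so floorEntry can find the covering entry
    private final TreeMap<Integer, Integer> startPcToLine = new TreeMap<>();

    void addEntry(int startPc, int sourceLine) {
        startPcToLine.put(startPc, sourceLine);
    }

    /** Returns the source line covering the offset, or -1 if no entry applies. */
    int lineFor(int bytecodeOffset) {
        Map.Entry<Integer, Integer> entry = startPcToLine.floorEntry(bytecodeOffset);
        return entry == null ? -1 : entry.getValue();
    }

    public static void main(String[] args) {
        OffsetToLine table = new OffsetToLine();
        // Assumed entries for main(), not taken from a real class file.
        table.addEntry(34, 12); // System.out.println("Caught an exception!");
        table.addEntry(42, 14); // System.out.println("Hello " + name0 + "!");
        table.addEntry(72, 15); // System.out.println("Hello " + name1 + "!");

        for (int offset : new int[] {39, 69, 99}) {
            System.out.println(offset + " -> line " + table.lineFor(offset));
        }
    }
}
```

Note that a LineNumberTable belongs to a single method's code, so offsets are only meaningful once you know which method they refer to. The tool output in the question names only the class, so that information would have to come from somewhere else.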


1 Comment

I don't think relying on SourceFile and LineNumberTable is a problem. If you compile your class file with no debug information, you'll get ... well, no debug information.
1

Another option to consider is taking advantage of existing static analysis tools such as FindBugs, which already has plenty of functionality for showing line numbers for flagged sections of code, as well as for producing nice HTML reports. You can extend FindBugs with your own bytecode analysis relatively easily. I have followed the tutorial here with success: http://www.ibm.com/developerworks/library/j-findbug2/


1

Soot does this for you. You run your class file through Soot, which converts it to Jimple. You can then ask for the ValueBox of any line of code and call its methods .getJavaSourceStartLineNumber() and .getJavaSourceStartColumnNumber(), which, according to their documentation, return the line number and column number in the source code.

That is, by far, the easiest way to do this. Soot is a static analysis tool built explicitly for Java.


-1

You can use JD for your purpose. Once you have the tool, you can decompile through the prompt using:

jd-gui.exe YourClassFile.class

You can start a subprocess using the Process class, decompile your .class file, and then write out a new file annotated with the corresponding line numbers. Then select only the lines whose numbers are returned by your tool.

1 Comment

I already have the source, so I don't want to use JD. It is possible that the code from JD will not match the original source.
