12

I'm working on a utility which needs to resolve hex addresses to a symbolic function name and source code line number within a binary. The utility will run on Linux on x86, though the binaries it analyzes will be for a MIPS-based embedded system. The MIPS binaries are in ELF format, using DWARF for the symbolic debugging information.

I'm currently planning to fork objdump, passing in a list of hex addresses and parsing the output to get function names and source line numbers. I have compiled an objdump with support for MIPS binaries, and it is working.
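For reference, the fork-and-parse approach could look roughly like the sketch below. It swaps in binutils' `addr2line` (run with `-f -e`) in place of objdump, since that tool prints a function name line followed by a `file:line` line for each address. The helper names `parse_addr2line` and `resolve` are made up for illustration, and the default tool name is an assumption; a MIPS cross-toolchain build will likely have a different, prefixed name.

```python
import subprocess


def parse_addr2line(output):
    """Parse `addr2line -f` output: alternating function-name and file:line lines."""
    lines = output.splitlines()
    results = []
    for func, location in zip(lines[0::2], lines[1::2]):
        filename, _, lineno = location.rpartition(":")
        # Drop trailing annotations such as " (discriminator 3)".
        parts = lineno.split()
        lineno = parts[0] if parts else ""
        results.append((func, filename, int(lineno) if lineno.isdigit() else None))
    return results


def resolve(binary, addresses, tool="addr2line"):
    """Run the tool once for all addresses and return (function, file, line) tuples.

    `tool` is an assumption: pass the name of an addr2line built with
    support for the target architecture.
    """
    cmd = [tool, "-f", "-e", binary] + [hex(a) for a in addresses]
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return parse_addr2line(out)
```

Passing all addresses in one invocation means only one process is forked per binary. Unresolved addresses come back as `??` with line `0` or `None`.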

I'd prefer to have a package allowing me to look things up natively from the Python code without forking another process. I can find no mention of libdwarf, libelf, or libbfd on python.org, nor any mention of Python on dwarfstd.org.

Is there a suitable module available somewhere?

6 Answers

9

You might be interested in the DWARF library from pydevtools:

>>> from bintools.dwarf import DWARF
>>> dwarf = DWARF('test/test')
>>> dwarf.get_loc_by_addr(0x8048475)
('/home/emilmont/Workspace/dbg/test/main.c', 36, 0)

5

Please check pyelftools - a new pure Python library meant to do this.
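A rough sketch of an address-to-line lookup with pyelftools follows. The helpers `lookup_line` and `rows_from_elf` are hypothetical names, not part of the library. The library calls (`ELFFile`, `has_dwarf_info`, `get_dwarf_info`, `iter_CUs`, `line_program_for_CU`, `get_entries`) are used as in the library's own examples, and the file-index handling is an assumption about the DWARF version convention.

```python
def lookup_line(line_rows, address):
    """Find the (file, line) whose address range contains `address`.

    `line_rows` is an iterable of (address, file, line, end_sequence)
    tuples in line-table order.
    """
    prev = None
    for row in line_rows:
        if prev is not None and prev[0] <= address < row[0]:
            return prev[1], prev[2]
        # An end_sequence row closes a range and never starts one.
        prev = None if row[3] else row
    return None


def rows_from_elf(path):
    """Yield line-table rows from an ELF file using pyelftools."""
    # Imported lazily so lookup_line() is usable without pyelftools installed.
    from elftools.elf.elffile import ELFFile

    with open(path, "rb") as f:
        elf = ELFFile(f)
        if not elf.has_dwarf_info():
            return
        dwarf = elf.get_dwarf_info()
        for cu in dwarf.iter_CUs():
            lineprog = dwarf.line_program_for_CU(cu)
            if lineprog is None:
                continue
            header = lineprog.header
            # Assumption: file indices are 1-based before DWARF 5, 0-based from 5 on.
            delta = 1 if header["version"] < 5 else 0
            files = header["file_entry"]
            for entry in lineprog.get_entries():
                state = entry.state
                if state is None:  # entry did not add a row to the line table
                    continue
                idx = state.file - delta
                raw = files[idx].name if 0 <= idx < len(files) else b"?"
                name = raw.decode(errors="replace") if isinstance(raw, bytes) else raw
                yield state.address, name, state.line, state.end_sequence
```

Usage would be along the lines of `lookup_line(rows_from_elf("firmware.elf"), 0x8048475)`, where the file name is a placeholder. This scans the whole line table on every lookup, so for many addresses it would be better to collect the rows once and reuse them.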

4

You should give Construct a try. It is very useful for parsing binary data into Python objects.

There is even an example for the ELF32 file format.

2 Comments

I'm looking for something similar and checked out Construct. What's there is quite nice, but the project hasn't been updated in quite some time.
Just had a look at Construct, and it seems really terrific. Very impressed.
3

I don't know of any, but if all else fails you could use ctypes to directly use libdwarf, libelf or libbfd.
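As a starting point, here is a minimal sketch of the ctypes route that only loads libelf and performs the `elf_version(EV_CURRENT)` handshake the library requires before any other call. The helper name `load_libelf` is hypothetical, and the sketch assumes libelf is installed where `ctypes.util.find_library` can locate it. Everything beyond this step (opening a file, walking sections, calling into libdwarf) would need further prototypes declared by hand.

```python
import ctypes
import ctypes.util

EV_NONE = 0     # returned by elf_version() on failure
EV_CURRENT = 1  # current ELF version, as defined in <elf.h>


def load_libelf():
    """Return a ctypes handle to libelf, or None if it cannot be used."""
    path = ctypes.util.find_library("elf")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        # unsigned int elf_version(unsigned int version);
        lib.elf_version.argtypes = [ctypes.c_uint]
        lib.elf_version.restype = ctypes.c_uint
    except (OSError, AttributeError):
        return None
    # libelf requires this call before any other library function.
    if lib.elf_version(EV_CURRENT) == EV_NONE:
        return None
    return lib
```

Returning `None` instead of raising keeps the caller free to fall back to another approach, such as forking an external tool.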

3

I've been developing a DWARF parser using Construct. It is currently fairly rough, and parsing is slow, but I thought I should at least let you know. It may suit your needs with a bit of work.

I've got the code in Mercurial, hosted at bitbucket:

Construct is a very interesting library. DWARF is a complex format (as I'm discovering) and, I think, pushes Construct to its limits.

3 Comments

Hi Craig, do you have any examples of how to use your DWARF parser? I've looked at your repo but couldn't find any. How could I do something like emilmont's dwarf.get_loc_by_addr() example?
@NickToumpelis, I haven't done any more work on this for a while, but I'm now just getting back to it since it could be useful at my work. I'm not entirely happy with the Construct-based solution, because the parsing is slow. There's currently no high-level API of the kind you asked about. It gets as far as parsing the DWARF info into a tree. The next task would be to search the tree for the info you're looking for. The DWARF format is so expressive that I'm not sure what a good, simple API for accessing the data would look like.
Craig: pyelftools (bitbucket.org/eliben/pyelftools) is built on top of Construct, using it for the low-level API but adding a full-featured high-level API on top.
2

hachoir is another library for parsing binary data.
