I'm trying to implement a debugging helper that stringifies an XML node, using GDB 7.2's Python interface. The idea is to get the node's address, then pass it to the XML library (libxml2) using ctypes.

I've managed to get the XML node's address (a gdb.Value), and I can call functions in the XML library. But somehow, the two ends don't meet.

// prototype of functions to call
int xmlNodeDump (xmlBufferPtr buf, xmlDocPtr doc, xmlNodePtr cur, int level, int format);
xmlBufferPtr xmlBufferCreate(void);

And the Python part calling these functions:

from ctypes import CDLL, POINTER, Structure, c_int, c_ubyte, c_uint, c_void_p

# this is xmlBuffer
class lxmlBufferStruct(Structure):
    _fields_ = [('content', POINTER(c_ubyte)),
        ('use', c_uint), ('size', c_uint),
        ('alloc', c_int), ('contentIO', POINTER(c_ubyte))]

# pNode: gdb.Value containing the addr of xmlNodePtr cur
# pDoc: gdb.Value containing the addr of xmlDocPtr doc

# libxml2 is loaded into the process running this script, i.e. GDB itself
libxml2 = CDLL('libxml2.so.2')
xmlBufferCreate = libxml2.xmlBufferCreate
xmlBufferCreate.restype = POINTER(lxmlBufferStruct)
xmlBuf = xmlBufferCreate()
libxml2.xmlNodeDump(xmlBuf, c_void_p(int(str(pDoc), 16)),
    c_void_p(int(str(pNode), 16)), 0, 0)

This usually crashes GDB inside xmlNodeDump. Any hints as to what I'm doing wrong?

  • I would be curious whether those addresses are still valid: define the xmlDoc and xmlNode structures, cast the void pointers to pointers of those types, and try to access the fields. Commented Apr 12, 2011 at 18:14
  • Another angle to consider: I'm not sure it's safe to assume that the alloc field is the size of an int, since it's an enum. A wrong size could wreak havoc on the alignment of the contentIO field. Commented Apr 12, 2011 at 18:19

1 Answer

Think about what you are doing. It can't possibly work!

You get a gdb.Value, representing the address of xmlNodePtr in the inferior (being debugged) process.

You then pass that address into libxml2.so.2, loaded into GDB itself.

But the address in the inferior is very likely inaccessible within GDB. If by chance it is accessible, it almost certainly does not point to an xmlNode. And if by miracle it does point to xmlNode, it would still not be the node you want (not the one in the inferior process).
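
To make the point concrete outside of GDB, here is a minimal sketch (not part of the original answer) showing that the same numeric address refers to different memory in two processes. It assumes a POSIX system where os.fork is available; the function name is made up for this illustration.

```python
import ctypes
import os


def same_address_different_memory():
    """Return (value seen by child, value seen by parent) at one address."""
    value = ctypes.c_int(1)
    address = ctypes.addressof(value)  # a plain integer, like a gdb.Value address

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: same numeric address, but its own copy of the memory.
        try:
            os.close(read_fd)
            ctypes.c_int.from_address(address).value = 2
            seen = ctypes.c_int.from_address(address).value
            os.write(write_fd, str(seen).encode())
        finally:
            os._exit(0)  # never fall back into the parent's code path

    # Parent process: read what the child saw, then look at the same address.
    os.close(write_fd)
    child_saw = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    parent_sees = ctypes.c_int.from_address(address).value
    return child_saw, parent_sees
```

The child's write is invisible to the parent even though both use the same address. GDB and the inferior are in the same situation, except that they do not even start out with the same memory contents, so an inferior address dereferenced inside GDB points at unrelated data or at nothing at all.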

There are two ways to fix this.

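  • If you have a live inferior process (i.e. you are not doing post-mortem debugging), you can simply call xmlNodeDump in the inferior from GDB with the call command, passing inferior addresses for all of its arguments: call xmlNodeDump(buf, doc, cur, 0, 0), where buf was itself created in the inferior (e.g. with call xmlBufferCreate()).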
  • If you are doing post-mortem debugging, or just don't want to call into the inferior process (doing so "disturbs" the inferior), you have to re-implement xmlNodeDump entirely in Python, using gdb.Value, dereference, cast, etc. etc.
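
For the live-process case, the same calls can be driven from GDB's Python interface instead of typing them by hand. The following is a rough sketch under several assumptions, not a tested helper: dump_xml_node and its evaluate parameter are names invented here; it relies on gdb.parse_and_eval being able to call functions in the inferior; and it assumes the inferior's libxml2 also provides xmlBufferContent and xmlBufferFree, which the question does not mention. The explicit casts are there because GDB may not know the return types when libxml2 has no debug info.

```python
def dump_xml_node(node_expr, doc_expr, evaluate=None):
    """Serialize an xmlNode by calling libxml2 inside the inferior.

    node_expr and doc_expr are GDB expressions (or addresses as strings)
    that are valid in the inferior, e.g. "cur" and "cur->doc".
    """
    if evaluate is None:
        import gdb  # only importable inside GDB's embedded Python
        evaluate = gdb.parse_and_eval

    # The buffer is allocated in the inferior's address space, not GDB's.
    buf = int(evaluate("(unsigned long) xmlBufferCreate()"))
    try:
        written = evaluate(
            "(int) xmlNodeDump((void *) %d, %s, %s, 0, 0)"
            % (buf, doc_expr, node_expr)
        )
        if int(written) < 0:
            raise RuntimeError("xmlNodeDump reported an error")
        # Let GDB read the resulting C string out of the inferior's memory.
        content = evaluate("(const char *) xmlBufferContent((void *) %d)" % buf)
        return content.string()
    finally:
        # Release the buffer in the inferior even if the dump failed.
        evaluate("(void) xmlBufferFree((void *) %d)" % buf)
```

Inside a GDB session this might be used as print(dump_xml_node("cur", "cur->doc")). Passing the evaluator as a parameter is only a convenience so the logic can be exercised without GDB. Note that every call runs code in the inferior, so it does disturb the debugged process and does not work on a core file.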

2 Comments

For some reason I assumed that GDB loads into the same address space as the inferior. If this is not true, it's clear why the behaviour of my script is rather random.
I wondered why the C++ debugging helpers used to work in earlier times: they were injected into the debuggee process.
