I'm writing a Python C extension that wraps an external C library. In the original library there are structs (of type T for the sake of the discussion), so my extension class looks like this:

typedef struct {
  PyObject_HEAD
  T *cdata;
} TWrapperBase;

I also need to look up pointers from Python from time to time, so I exposed a read-only field _cdata that holds the cdata pointer as an unsigned long long (yes, I know it's not very portable, but that's out of scope for now).

Then, I want to be able to add some more methods in Python, but I can't just append them to a class declared in C, so I subclass it and add my new methods:

class TWrapper(TWrapperBase):
    ...

Now, in my C extension code I need a way of accessing the cdata field, so I can pass it to library functions. I know that self won't be an instance of TWrapperBase, but rather of TWrapper (the Python subclass). What is the proper way to do this?

static PyObject * doStuff(PyObject *self)
{
  T *cdata_ptr;
  // How to get a pointer to cdata?
  //
  // This looks very unsafe to me, do I have any guarantee of
  // the subclass memory layout?
  // 1. cdata_ptr = ((TWrapperBase*)self)->cdata
  //
  // This is probably safe, but it seems to be a bit of a hassle
  // to query it with a string key
  // 2. cdata_ptr = PyLong_AsVoidPtr(PyObject_GetAttrString(self, "_cdata"))
  do_important_library_stuff(cdata_ptr);
  Py_INCREF(self);
  return self;
}

Thanks!

  • Why are you doing Py_INCREF(self); after you do the library stuff? Commented May 3, 2016 at 22:25
  • Is that why option 1. looks unsafe to you? Commented May 3, 2016 at 22:27
  • I'm INCREFing it only because it's the return value and returned objects are stolen references. Commented May 3, 2016 at 22:46
  • You may want to INCREF before you do anything with the object, such as accessing cdata. As you say, it is a stolen reference and the destructor may invalidate self->cdata. Commented May 4, 2016 at 5:07
  • You're right, thanks! Commented May 4, 2016 at 7:23

1 Answer

  // This looks very unsafe to me, do I have any guarantee of
  // the subclass memory layout?
  // 1. cdata_ptr = ((TWrapperBase*)self)->cdata

Yeah, that works. You can look at all the implementations of Python's built-in types and see that they do pretty much the same thing, usually without checking whether they're operating on a subclass instance.
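
You can also check this from Python without writing any C. The sketch below is only an illustration, not part of the extension: it uses list as a stand-in for TWrapperBase, a hypothetical MyList as a stand-in for TWrapper, and a hypothetical helper read_ob_size. It assumes CPython, where id(obj) is the object's address and object.__basicsize__ equals sizeof(PyObject), so the ob_size field of a list sits right after the object header. Reading that field at the same offset works for the base type and for the Python subclass, which is what the cast in option 1 relies on.

```python
import ctypes


class MyList(list):
    """Python-level subclass of a C-implemented type (stand-in for TWrapper)."""

    def describe(self):
        return f"MyList with {len(self)} items"


def read_ob_size(obj):
    """Read the C-level ob_size field the way C code would after a cast.

    Assumption: CPython, where id(obj) is the address of the object and
    object.__basicsize__ == sizeof(PyObject), so ob_size (the first field
    after the PyObject header in a PyVarObject) lives at that offset.
    """
    address = id(obj) + object.__basicsize__
    return ctypes.c_ssize_t.from_address(address).value


plain = [10, 20, 30]
sub = MyList([10, 20, 30, 40])

# Same offset, same field, whether or not the instance is a subclass.
print(read_ob_size(plain), len(plain))
print(read_ob_size(sub), len(sub))
```

Both lines should print the same number twice: the base-class part of the object starts at the address the PyObject * points to, and the subclass only adds things around it.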

4 Comments

Oh really? So is it always the case that fields and other data needed by a subclass are laid out in memory after the data of its base class? Is that part of some standard, or is it just how Python is always implemented?
@wesolyromek: It's how the CPython implementation works. Note that "fields" defined in Python are usually __dict__ entries rather than the kind of struct fields you'd define in C, so Python-level subclasses don't usually do much to the memory layout of their superclasses. (If you use __slots__, that changes things.)
So (as long as I don't use __slots__) it works like this: every object's layout starts with what its C base class has, and then only some pointers to these dictionaries and other fields are added to get subclasses, right?
@wesolyromek: __dict__ usually gets added to the layout, and __weakref__, a thing involved in supporting weak references, sometimes gets added too. There's also a weird case where the __dict__ can appear before the superclass parts of the object layout if the superclass has variable-length instances, like with tuple. In that case, the PyObject * that points to the object actually points somewhere in the middle, so you still don't need to do anything special to handle subclass instances.
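
The layout points from these comments can be inspected with the type attributes __basicsize__ and __dictoffset__. This is a rough sketch assuming CPython, again with list standing in for a C base type; Plain and Slotted are hypothetical names. The exact values differ between CPython versions (newer ones keep the instance __dict__ outside the fixed-size part), so the sketch only checks relations that I would expect to hold across versions.

```python
import struct

# Size of a C pointer on this platform; each __slots__ entry stores one PyObject *.
POINTER_SIZE = struct.calcsize("P")


class Plain(list):
    """Ordinary subclass: gets a __dict__, base layout is left alone."""


class Slotted(list):
    """Subclass with __slots__: two extra pointer fields after the base part."""

    __slots__ = ("a", "b")


base_size = list.__basicsize__

# The subclass instance is never smaller than the base struct.
print(Plain.__basicsize__ >= base_size)

# Slots are appended after the base-class fields, one pointer each.
print(Slotted.__basicsize__ - base_size == 2 * POINTER_SIZE)

# list itself has no instance __dict__; the plain subclass does, the slotted one does not.
print(list.__dictoffset__, Plain.__dictoffset__ != 0, Slotted.__dictoffset__)
```

In every case the base-class fields keep their offsets, so C code written against the base struct still finds them in a subclass instance.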
