
I have been trying to optimize a code segment by making calls to functions inside it parallel using prange. This requires that all the functions inside the prange block run with nogil, so I am in the process of adapting them to not use the GIL. However, when trying to adapt one of the functions I'm running into a problem regarding Python Locals/Temporaries. The function is below:

cdef float output(Branch self):
        cdef float output = 0.0
        cdef Dendrite dendrite   # This line is considered a Python Local
        cdef Branch branch       # This line is considered a Python Local
        cdef int index = 0
        while index < self.length:
            if self.isDendrite:
                dendrite = self.inputs[index]
                output += dendrite.output()
            else:
                branch = self.inputs[index]
                output += branch.output()

            index += 1
        return self.activation.eval(output) * self.weight

When trying to convert the function to run nogil, the following error message is returned by Cython:

Function declared nogil has Python locals or temporaries

pointing to the function header.

For context, these are the fields that the Branch and Dendrite classes own (Node is another cdef class that is referenced by Dendrite):

cdef class Branch():
    cdef:
        np.ndarray inputs     # Holds either Dendrite or Branch objects (but never both)
        float weight
        floatFunc activation
        bint isDendrite       # Used to determine if Dendrites or Branches are held
        int length

cdef class Dendrite():
    cdef:
        float charge
        Node owner            # The class below
        float weight
        np.ndarray links      # Holds objects that rely on Node and Dendrite
        floatFunc activation  # C-optimized class
        int length

cdef class Node:
    cdef:
        float charge
        float SOMInhibition
        np.ndarray senders         # Holds objects that rely on Node and Dendrite
        int numSenders
        np.ndarray position        # Holds ints
        NodeType type              # This is just an enum
        np.ndarray dendrites       # This holds Dendrite objects
        int numDendrites
        np.ndarray inputDendrites  # Holds Dendrite objects
        int numInputDendrites
        np.ndarray inputBranches   # Holds Branch objects
        int numInputBranches
        int ID
        floatFunc activation       # C-optimized class

My guess is that this has something to do with the fact that the classes have NumPy arrays as fields, but NumPy is compatible with Cython and should not be making Python objects (if I understood correctly).

How can I make it so that those lines are not counted as Python objects?

It has been mentioned that untyped NumPy arrays do not provide much benefit in terms of performance. In the past I tried to type them during class declaration, but Cython threw a compile error when it saw the type identifiers in the class declaration. I do type the arrays in the initializer before assigning them to fields though, so does that still work, or is the typing during initialization not relevant?

  • self.inputs is an np.ndarray that holds generic Python objects rather than typed elements, which may be the source of the error. A NumPy array cannot contain custom native C types, only native primitive types (e.g. np.int32, np.float64, etc.). The same applies to the other np.ndarray-typed attributes. Commented Oct 15, 2023 at 10:29
  • Yes - np.ndarray without typing the element gives you very little advantage in Cython, and will require the GIL for most operations (except possibly accessing the shape) Commented Oct 15, 2023 at 11:56
  • But fundamentally, a cdef class is a Python object, so it will need the GIL. Commented Oct 15, 2023 at 11:57
  • You can't use typed Numpy arrays as an attribute of a cdef class (as you note) but you can use typed memoryviews which are their more modern replacement Commented Oct 15, 2023 at 18:06

2 Answers

1

The main problem is that your cdef classes have attributes that are containers of other cdef class instances, and accessing an element of such a container is not allowed without the GIL. Judging by the code you posted, you want to do object-oriented computation. Unfortunately, doing that in a highly parallel way using only Cython and Python code is currently neither supported nor safe.

References:

cython: how do you create an array of cdef class

https://groups.google.com/forum/#!topic/cython-users/G8zLWrA-lU0

One of the safest solutions (though it requires more coding) is to implement the object-oriented part, i.e. the Branch, Dendrite and Node classes, in C++, then declare and import them into Cython as cppclasses, following the Cython documentation for wrapping C++ classes. You then manipulate them as nogil cppclasses instead of cdef classes. Inside the cppclass, you can declare the container as a pointer to the cppclass.

Example:

test_class.h

#ifndef TEST_CLASS_H
#define TEST_CLASS_H


class test_class {
    public:
        test_class * container;
        int b;
        test_class();
        test_class(int, int);
        test_class(int);
        ~test_class();
        int get_b();
        test_class* get_member(int);
};

#endif

test_class.cpp

#include "test_class.h"
#include <cstddef>

test_class::test_class(){
    this->b = 0;
    this->container = NULL;
}

test_class::test_class(int b){
    this->b = b;
    this->container = NULL;
}

test_class::test_class(int b, int n){
    this->b = b;
    // Initialize to NULL so the destructor's delete[] is safe when n <= 0.
    this->container = NULL;
    if (n > 0){
        this->container = new test_class[n];
        for (int i = 0; i < n; i++) {
            this->container[i].b = i;
        }
    }
}

test_class::~test_class(){
    delete[] this->container;
}

int test_class::get_b(){
    return this->b;
}

// No bounds checking: the caller must pass a valid index.
test_class* test_class::get_member(int i){
    return &(this->container[i]);
}

Cython code:

# distutils: language = c++

cdef extern from "test_class.cpp":
    pass

cdef extern from "test_class.h":
    cdef cppclass test_class nogil: #add nogil here
        test_class * container
        int b
        test_class() except +
        test_class(int) except +
        test_class(int, int) except +
        int get_b()
        test_class * get_member(int)

cdef int inner_function(test_class * class_obj) nogil: #nogil is allowed
    cdef int i
    cdef int b = 0
    cdef test_class * inner_obj

    for i in range(10):
        inner_obj = class_obj.get_member(i)
        b += inner_obj.get_b()
    return b

def outer_function():
    cdef test_class * class_obj = new test_class(999, 10)
    cdef int sum_b = inner_function(class_obj)
    del class_obj
    return sum_b  # returns 45 (0 + 1 + ... + 9)
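The same idea can be applied to the classes from the question. Below is a minimal C++ sketch, not the asker's actual implementation: the `activation` function is a hypothetical stand-in for `floatFunc` (here simply `tanh`), `Dendrite::output()` is assumed to return the activated charge times the weight (the question does not show it), and the `isDendrite` flag is replaced by two separate vectors, one of which stays empty.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for the question's floatFunc activation.
inline float activation(float x) { return std::tanh(x); }

struct Dendrite {
    float charge = 0.0f;
    float weight = 1.0f;

    // Assumed behaviour: activated charge scaled by the weight.
    float output() const { return activation(charge) * weight; }
};

struct Branch {
    float weight = 1.0f;
    std::vector<Dendrite> dendrites;  // used when the branch holds dendrites
    std::vector<Branch> branches;     // used when the branch holds sub-branches

    // Plain C++: no Python objects are touched, so no GIL is needed.
    float output() const {
        float sum = 0.0f;
        for (const Dendrite& d : dendrites) sum += d.output();
        for (const Branch& b : branches) sum += b.output();
        return activation(sum) * weight;
    }
};
```

Declared in Cython as a `cdef cppclass ... nogil`, in the same way as `test_class` above, `Branch::output()` could then be called from inside a nogil block.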

Finally, regarding the main question: why does Cython treat cdef classes as Python objects? Because a cdef class (a Cython class) is meant to be an intermediate between a Python class and a cppclass, so its instances are still Python objects and need the GIL. Cython may eventually support defining cppclasses directly in Cython as a standard, documented feature.

References:

https://www.nexedi.com/NXD-Document.Blog.Cypclass


1

NumPy has both Python and C APIs, and you can even do:

import numpy as np
cimport numpy as cnp

I know this is confusing and to be honest, I'm not sure exactly where Python objects are being held in your code, but you don't need to worry about it; in general classes and function calls have very little overhead (I know Cython marks them as dark yellow, meaning expensive, but usually they are called a few times to process millions of items, at least if you have designed your system correctly).

Try to define your classes and infrequent functions in Python and pass memory views to Cython to do the bulk of the computation. This way you can release the GIL as memory views are not Python objects and you don't have to worry about converting everything to C++ (If that's your goal, why not write the library in C++ to start with and then write a Cython wrapper to use that library?).

2 Comments

The ndarrays in the classes' definitions are already imported through the C API. I am not very concerned about the overhead of the function call, but the function shown above must be called many times during execution, so I wanted to make it nogil to run the calls in parallel (each call is done by a different instance of Branch on different data, so unfortunately I can't make each call process more data).
I had a similar problem when trying to perform computations on matrices of varying size (there was a mixture of 3x3, 4x4, ... 17x17). I ended up having to flatten and append the matrices and pass in the size and index information as 2 separate arrays, which solved the issue. I know this is a somewhat convoluted way of solving the problem, but as I said, try defining your functions and classes in Python and converting the compute part to Cython. If you post more detailed examples with runnable code, I can help you profile and optimize this further.
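The flattening approach described in the last comment can be sketched roughly as follows. This is an illustration under assumptions, not the commenter's code: the function names `offsets_from_sizes` and `traces` are hypothetical, the matrices are assumed to be square and stored row-major, and the trace is used only as a placeholder for the real per-matrix computation.

```cpp
#include <cstddef>
#include <vector>

// Build the start index of each matrix inside the flat buffer, given the
// side length of each square matrix.
std::vector<std::size_t> offsets_from_sizes(const std::vector<std::size_t>& sizes) {
    std::vector<std::size_t> offsets(sizes.size());
    std::size_t next = 0;
    for (std::size_t k = 0; k < sizes.size(); ++k) {
        offsets[k] = next;
        next += sizes[k] * sizes[k];
    }
    return offsets;
}

// Compute the trace of every matrix stored back to back in `flat`.
// Each iteration of the outer loop reads only its own slice, so the
// iterations are independent of one another.
std::vector<double> traces(const std::vector<double>& flat,
                           const std::vector<std::size_t>& sizes,
                           const std::vector<std::size_t>& offsets) {
    std::vector<double> out(sizes.size(), 0.0);
    for (std::size_t k = 0; k < sizes.size(); ++k) {
        const std::size_t n = sizes[k];
        const double* m = flat.data() + offsets[k];
        for (std::size_t i = 0; i < n; ++i) {
            out[k] += m[i * n + i];
        }
    }
    return out;
}
```

Because the data lives in three flat buffers rather than in a container of objects, the same layout can be handed to a nogil function as typed memoryviews, and the outer loop is the natural candidate for prange.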
