I'm trying to write a C extension that accepts numpy arrays as inputs. Everything works fine except when I pass in a string as an argument.

#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
#include "../../include/Python.h"
#include "../../include/arrayobject.h"

static PyObject *max(PyObject *self, PyObject *args)
{
    PyArrayObject *arr;
    long i, n, strides;

    if (PyArg_ParseTuple(args, "O!", &PyArray_Type, &arr)){
        /* Get some info about the data. */
        n           = PyArray_DIMS(arr)[0];
        strides     = PyArray_STRIDES(arr)[0];
        void *data0 = PyArray_DATA(arr);
        int typenum = PyArray_TYPE(arr);

        if (typenum == NPY_DOUBLE){
            double max = *(double *)data0;
            for (i=0; i<n; ++i){
                if (*(double *)data0 > max){
                    max = *(double *)data0;
                }
                data0 += strides;
            }
            return Py_BuildValue("d", max);
        }
        else if (typenum == NPY_LONG){
            long max = *(long *)data0;
            for (i=0; i<n; ++i){
                if (*(long *)data0 > max){
                    max = *(long *)data0;
                }
                data0 += strides;
            }
            return Py_BuildValue("l", max);
        }
        else {
            PyErr_Format(
                PyExc_TypeError, "\rInput should be a numpy array of numbers."
            );
            return NULL;
        }
    }
    else{
        PyErr_Format(
            PyExc_TypeError, "\rInput should be a numpy array of numbers."
        );
        return NULL;
    }
}

static PyMethodDef DiffMethods[] =
{
    {"max", max, METH_VARARGS, "Compute the maximum of a numpy array."},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef cModPyDem =
    {PyModuleDef_HEAD_INIT, "_math_functions", "", -1, DiffMethods};

PyMODINIT_FUNC PyInit__math_functions(void)
{
    import_array();
    return PyModule_Create(&cModPyDem);
}

I then run this setup.py script:

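def configuration(parent_package=None, top_path=None):
    from numpy.distutils.misc_util import Configuration
    # Create the configuration object that the extension is registered on.
    config = Configuration(None, parent_package, top_path)
    config.add_extension('_math_functions', ['_math_functions.c'])

    return config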

if __name__ == "__main__":
    from numpy.distutils.core import setup
    setup(configuration=configuration)

With these commands:

python setup.py config --compiler=gnu99 build_ext --inplace
rm -rf build/

And that works nicely. The function works for the most part:

In [1]: import _math_functions as mf

In [2]: import numpy as np

In [3]: x = np.random.randint(-1e3, 1e3, size=100)

In [4]: np.max(x), mf.max(x)
Out[4]: (998, 998)

In [5]: x = np.random.rand(100)

In [6]: np.max(x), mf.max(x)
Out[6]: (0.9962604850115798, 0.9962604850115798)

It can also handle inappropriate inputs, somewhat:

In [7]: x = np.array([1,2,"bob"])

In [8]: mf.max(x)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-7ced17af9505> in <module>()
----> 1 mf.max(x)

Input should be a numpy array of numbers.

In [9]: mf.max("bob")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-a656f60cf00d> in <module>()
----> 1 mf.max("bob")

Input should be a numpy array of numbers.

The problem occurs with the following input:

In [10]: x = np.array("Bob")

In [11]: mf.max(x)
Segmentation fault: 11

EDIT: Some things I've tried. I used this call instead:

PyArg_ParseTuple(args, "O", &arr)

This still gave a segfault. I also put printf("i") before every line (with i = 1, 2, ...), so I'm sure the segfault happens at PyArg_ParseTuple.

I read through the documentation and found the "O&" option, but could not get that to work. Any advice on how to properly use that is welcome.

I've also gone through these relevant posts: PyArg_ParseTuple causing segmentation fault

PyArg_ParseTuple SegFaults in CApi (Not sure how the solution to this one would apply...)

Crash when calling PyArg_ParseTuple on a Numpy array

Any clues on how to properly handle this? The output I want is a TypeError being raised.

Thanks!

2 Answers

Have you tried adding debug statements to figure out exactly where in the code the segmentation fault is happening?
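One thing to watch for with print-style debugging in C: stdout is usually buffered, so the last few messages can be lost when the process crashes, which makes the crash look earlier than it really is. A minimal sketch of a checkpoint helper that flushes after every message follows. The name trace_checkpoint is hypothetical and not part of the Python or NumPy API.

```c
#include <stdio.h>

/* Hypothetical helper: write a numbered checkpoint to `out` and flush
   right away, so the message is not left in a buffer if the process
   crashes on the next statement. */
static void trace_checkpoint(FILE *out, int step, const char *what)
{
    fprintf(out, "checkpoint %d: %s\n", step, what);
    fflush(out);
}
```

Calling something like trace_checkpoint(stderr, 1, "after PyArg_ParseTuple") between the statements of the extension function should narrow down the failing line.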

Assuming that the segmentation fault happens here:

if (PyArg_ParseTuple(args, "O!", &PyArray_Type, &arr)) {

Try parsing with a more general format such as "O", so that you do not assume a PyArrayObject instance was passed. Then check the type of the resulting generic PyObject* (see PyObject_TypeCheck) and control the flow of the program depending on that type. Raising exceptions from C is explained in the Extensions documentation, but I think it goes something like:

PyErr_SetString(PyExc_TypeError, "Input should be a numpy array of numbers.");
return NULL;

1 Comment

Hello, sorry for the delayed response. This is, unfortunately, one of the things I've already tried. I also looked into the "O&" option, but couldn't get that to work. I'd imagine the source code for numpy's max function would suffice, but I couldn't find it. I'll update my post with things I've tried so far.

Oh my, the problem had nothing to do with strings at all. If the input is zero-dimensional, PyArray_DIMS and PyArray_STRIDES return NULL, and indexing those pointers with [0] is what causes the segfault. I added more print statements, and the program does indeed get past PyArg_ParseTuple. I truly am a fool. Here's a full working example; I simply added a check for those two pointers.

#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
#include "../../include/Python.h"
#include "../../include/arrayobject.h"

static PyObject *max(PyObject *self, PyObject *args)
{
    PyArrayObject *arr;
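    /* npy_intp matches the type of the dims/strides entries; a byte pointer
       keeps `data0 += strides` standard C (void* arithmetic is a GNU extension). */
    npy_intp i, n, strides;
    char *data0;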

    if (PyArg_ParseTuple(args, "O!", &PyArray_Type, &arr)){

        // Check to make sure input isn't zero dimensional!
        if ((PyArray_DIMS(arr) == NULL) || (PyArray_STRIDES(arr) == NULL)){
            PyErr_Format(PyExc_TypeError,
                         "Input is zero-dimensional.");
            return NULL;
        }

        // Useful information about the data.
        int typenum = PyArray_TYPE(arr);
        n           = PyArray_DIMS(arr)[0];
        strides     = PyArray_STRIDES(arr)[0];
        data0       = PyArray_DATA(arr);

        if (typenum == NPY_DOUBLE){
            double max = *(double *)data0;
            for (i=0; i<n; ++i){
                if (*(double *)data0 > max){
                    max = *(double *)data0;
                }
                data0 += strides;
            }
            return Py_BuildValue("d", max);
        }
        else if (typenum == NPY_LONG){
            long max = *(long *)data0;
            for (i=0; i<n; ++i){
                if (*(long *)data0 > max){
                    max = *(long *)data0;
                }
                data0 += strides;
            }
            return Py_BuildValue("l", max);
        }
        else {
            PyErr_Format(PyExc_TypeError,
                         "Input should be a numpy array of numbers.");
            return NULL;
        }
    }
    else{
        PyErr_Format(PyExc_TypeError,
                     "Input should be a numpy array of numbers.");
        return NULL;
    }
}

static PyMethodDef DiffMethods[] =
{
    {"max", max, METH_VARARGS, "Compute the maximum of a numpy array."},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef cModPyDem =
    {PyModuleDef_HEAD_INIT, "_math_functions", "", -1, DiffMethods};

PyMODINIT_FUNC PyInit__math_functions(void)
{
    import_array();
    return PyModule_Create(&cModPyDem);
}
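The zero-dimensional guard and the strided loop can also be looked at in isolation. The following is a minimal sketch under stated assumptions, not the extension itself: ArrayView and strided_max_double are hypothetical stand-ins that only model the dimension count, the dims and strides pointers, and the data pointer. It checks the dimension count before touching dims[0] (NumPy exposes that count as PyArray_NDIM), walks the data through a byte pointer, and additionally rejects empty input, which the code above does not do.

```c
#include <stddef.h>

/* Hypothetical stand-in for the few array fields the loop needs.
   This is NOT NumPy's PyArrayObject; it only models the pieces used here. */
typedef struct {
    int ndim;                  /* number of dimensions */
    const ptrdiff_t *dims;     /* assumed NULL when ndim == 0 */
    const ptrdiff_t *strides;  /* byte strides; assumed NULL when ndim == 0 */
    const char *data;          /* raw bytes of the first element */
} ArrayView;

/* Stores the maximum along the first axis in *out and returns 0.
   Returns -1 if the view is zero-dimensional or empty. */
static int strided_max_double(const ArrayView *a, double *out)
{
    /* Guard first: with no dimensions, dims[0] would dereference NULL. */
    if (a->ndim < 1 || a->dims == NULL || a->strides == NULL) {
        return -1;
    }

    ptrdiff_t n = a->dims[0];
    if (n < 1) {
        return -1;  /* no element to use as the initial maximum */
    }

    /* Strides are measured in bytes, so step with a char pointer. */
    const char *p = a->data;
    double best = *(const double *)p;
    for (ptrdiff_t i = 1; i < n; ++i) {
        p += a->strides[0];
        double v = *(const double *)p;
        if (v > best) {
            best = v;
        }
    }

    *out = best;
    return 0;
}
```

In the real extension, the -1 branch would correspond to setting a Python exception and returning NULL, as the code above does with PyErr_Format.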

Build is the same as before. This passed every test I threw at it thus far:

In [1]: import numpy as np                                                                                                                                  

In [2]: import _math_functions as mf                                                                                                                        

In [3]: for i in range(1000): 
   ...:     for j in range(10): 
   ...:         x = np.random.rand((i+1)*100) 
   ...:         if ((np.max(x) - mf.max(x)) != 0): 
   ...:             print(i, j) 
   ...:         x = np.random.randint(-1e13*(i+1), 1e13*(i+1), size=1000) 
   ...:         if ((np.max(x) - mf.max(x)) !=0): 
   ...:             print(i, j) 
   ...:                                                                                                                                                     
# Nothing prints, so np.max and mf.max are spitting out the same answer.
In [4]: mf.max("Bob")                                                                                                                                       
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-bc67f3e1c10d> in <module>
----> 1 mf.max("Bob")

TypeError: Input should be a numpy array of numbers.

In [5]: mf.max(np.array(1))                                                                                                                                 
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-1cb4380527fa> in <module>
----> 1 mf.max(np.array(1))

TypeError: Input is zero-dimensional.

In [6]: mf.max(np.array("Bob"))                                                                                                                             
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-47b1925b8c3c> in <module>
----> 1 mf.max(np.array("Bob"))

TypeError: Input is zero-dimensional.
