Recently, I needed to create a tool to scrape a page's source so I could parse data out of a public database for a project I'm working on. Python seemed like an easy solution, but it was a pain to get up and running, and currently I have it half working (it saves the source to a file instead of returning it). When I run my C++ code I get a strange error:

Exception ignored in: <module 'threading' from 'C:\\Python34\\Lib\\threading.py'>
Traceback (most recent call last):
  File "C:\Python34\Lib\threading.py", line 1293, in _shutdown
    t = _pickSomeNonDaemonThread()
  File "C:\Python34\Lib\threading.py", line 1300, in _pickSomeNonDaemonThread
    for t in enumerate():
  File "C:\Python34\Lib\threading.py", line 1270, in enumerate
    return list(_active.values()) + list(_limbo.values())
TypeError: an integer is required (got type NoneType)

My Python Code:

import urllib.request
import sys

def run(a):
    req = urllib.request.Request(a)
    res = urllib.request.urlopen(req)
    d = str(res.read())

    with open('temp.dat', 'w') as outfile:
        for x in range(0, len(d)):
            outfile.write(d[x])

The above code works correctly and doesn't raise any errors, so I think the mistake is somewhere in my C++ implementation. It's worth mentioning that it successfully saves the website's source code (the site is parameter a) to the 'temp.dat' file; I'm just trying to get rid of the error output.

My C++ code:

#include <Python.h>
#include <cstdio>
#include <string>

using std::string;

void pyCall(string url, string outfile, const char *mod = "Scrape", const char *dat = "run")
{
    PyObject *pName, *pModule, *pFunc;
    PyObject *pArgs, *pValue;

    Py_Initialize();

    /* Make modules in the current directory importable */
    PyObject *sysPath = PySys_GetObject((char*)"path");
    PyList_Append(sysPath, PyUnicode_FromString("."));

    pName = PyUnicode_FromString(mod);

    /* Error checking of pName left out */

    pModule = PyImport_Import(pName);
    Py_DECREF(pName);

    if (pModule != NULL)
    {
        pFunc = PyObject_GetAttrString(pModule, dat);
        /* pFunc is a new reference */

        if (pFunc && PyCallable_Check(pFunc))
        {
            /* Build a one-element argument tuple holding the URL */
            pArgs = Py_BuildValue("(s)", url.c_str());

            pValue = PyObject_CallObject(pFunc, pArgs);
            Py_DECREF(pArgs);
            if (pValue != NULL)
            {
                printf("Result of call: %ld\n", PyLong_AsLong(pValue));
                Py_DECREF(pValue);
            }
        }

        Py_XDECREF(pFunc);
        Py_DECREF(pModule);
    }
    Py_Finalize();
}

This code is pretty standard and is a 'cookie cutter' copy of the example in the Python documentation at https://docs.python.org/3.5/extending/embedding.html. The only differences are the way I pass the arguments and that I append the current directory to the path at the beginning.

Any help would be greatly appreciated.

1 Answer

I went outside for a quick break, collected my thoughts about what the error could be, and managed to fix it; sorry for the spam. The problem was that my Python function doesn't return anything, so the call gives back None, and my C++ code was assigning that to pValue and then converting it to an integer with PyLong_AsLong.

TypeError: an integer is required (got type NoneType)

I just took out the assignment in my C++ code and it worked.

PyObject_CallObject(pFunc, pArgs);
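
For anyone who would rather keep the result on the C++ side, another option is to make the Python function return an integer, so that PyLong_AsLong has something valid to convert. The following is only a rough sketch, not what I actually ended up using: fetch_source, the opener parameter, and the path parameter are names I made up for illustration, and I'm assuming the page is UTF-8 encoded.

```python
import urllib.request


def fetch_source(url, opener=urllib.request.urlopen):
    """Return the page source as text.

    `opener` is a hypothetical hook so the function can be tried
    without network access; it defaults to urllib's urlopen.
    """
    with opener(url) as res:
        # Decode the bytes (UTF-8 is an assumption) instead of calling
        # str() on them, which would give the "b'...'" representation.
        return res.read().decode("utf-8", errors="replace")


def run(url, path="temp.dat", opener=urllib.request.urlopen):
    """Save the page source to `path` and return the character count."""
    text = fetch_source(url, opener)
    with open(path, "w", encoding="utf-8") as outfile:
        outfile.write(text)
    # Returning an int means the caller gets a number rather than None.
    return len(text)
```

With this version the call returns a number rather than None, and fetch_source could be called directly once the C++ side is ready to receive the text itself.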

1 Comment

That's why Stack Overflow explicitly requires you to extract a minimal example before posting here.
