0

Edit (solution)

I've followed the advice of debugging with -fsanitize=address & valgrind. I only used -fsanitize (which I never heard of before) and found out what was the problem, there was a left over call for a destructor in another function and the object was being destroyed twice. The memory was completely jeopardised at this point.

Thanks a lot for the help and the other recommendations too.


I'm writing a code in C++ to talk with CouchDB using sockets (CouchDB is a Database by Apache that has an HTTP API). I've created a whole class to deal with it and it's basically a socket client that connects and closes.

One of my functions is to send an HTTP request and then read the response and work with it, it works fine on the first call, but fails when I call it a second time.

But it's inconsistent where it fails, sometimes it's a SEGFAULT inside of it in one of the string functions, other times it's a SIGABORT in the return. I've signalled the lines where it crashed with ->

And the worst part is that it only fails when it runs for the "second" time, which is actually the 10th time. Explanation: When the class is instantiated a socket is created, sendRequest is called 8 times (all work, always), I close the socket. Then I have another class that controls a socket server, which receives commands and creates a remote user object that executes the command, the remote user command then calls the CouchDB class to manipulate the DB. The first time a command is requested works, but the second fails and crashes the program.

Extra info: In the short int httpcode line, gdb trace shows it's a crash on substr, on the SIGABORT crash trace shows a problem on free().

I've already debugged many times, made some changes as to where and how to instantiate the string and the buffer and I'm lost. Anyone knows why it would work fine many times but crash on a subsequent call?

CouchDB::response CouchDB::sendRequest(std::string req_method, std::string req_doc, std::string msg)
{
    std::string responseBody;
    char buffer[1024];
    // zero message buffer
    memset(buffer, 0, sizeof(buffer));

    std::ostringstream smsg;
    smsg << req_method << " /" << req_doc << " HTTP/1.1\r\n"
         << "Host: " << user_agent << "\r\n"
         << "Accept: application/json\r\n"
         << "Content-Length: " << msg.size() << "\r\n"
         << (msg.size() > 0 ? "Content-Type: application/json\r\n" : "")
         << "\r\n"
         << msg;

    /*std::cout << "========== Request ==========\n"
              << smsg.str() << std::endl;*/

    if (sendData((void*)smsg.str().c_str(), smsg.str().size())) {
        perror("@CouchDB::sendRequest, Error writing to socket");
        std::cerr << "@CouchDB::sendRequest, Make sure CouchDB is running in " << user_agent << std::endl;
        return {-1, "ERROR"};
    }

    // response
    int len = recv(socketfd, buffer, sizeof(buffer), 0);

    if (len < 0) {
        perror("@CouchDB::sendRequest, Error reading socket");
        return {-1, "ERROR"};
    }
    else if (len == 0) {
        std::cerr << "@CouchDB::sendRequest, Connection closed by server\n";
        return {-1, "ERROR"};
    }

    responseBody.assign(buffer);
    // HTTP code is the second thing after the protocol name and version
->  short int httpcode = std::stoi(responseBody.substr(responseBody.find(" ") + 1));

    bool chunked = responseBody.find("Transfer-Encoding: chunked") != std::string::npos;
    /*std::cout << "========= Response =========\n"
              << responseBody << std::endl;*/
    // body starts after two CRLF
    responseBody = responseBody.substr(responseBody.find("\r\n\r\n") + 4);

    // chunked means that the response comes in multiple packets
    // we must keep reading the socket until the server tells us it's over, or an error happen
    if (chunked) {
        std::string chunkBody;
        unsigned long size = 1;

        while (size > 0) {
            while (responseBody.length() > 0) {
                // chunked requests start with the size of the chunk in HEX
                size = std::stoi(responseBody, 0, 16);
                // the chunk is on the next line
                size_t chunkStart = responseBody.find("\r\n") + 2;
                chunkBody += responseBody.substr(chunkStart, size);
                // next chunk might be in this same request, if so, there must have something after the next CRLF
                responseBody = responseBody.substr(chunkStart + size + 2);
            }

            if (size > 0) {
                len = recv(socketfd, buffer, sizeof(buffer), 0);

                if (len < 0) {
                    perror("@CouchDB::sendRequest:chunked, Error reading socket");
                    return {-1, "ERROR"};
                }
                else if (len == 0) {
                    std::cerr << "@CouchDB::sendRequest:chunked, Connection closed by server\n";
                    return {-1, "ERROR"};
                }

                responseBody.assign(buffer);
            }
        }
        // move created body from chunks to responseBody
->      responseBody = chunkBody;
    }
    return {httpcode, responseBody};
}

The function that calls the above and that sometimes SIGABORT

bool CouchDB::find(Database::db db_type, std::string keyValue, std::string &value)
{
    if (!createSocket()) {
        return false;
    }
    std::ostringstream doc;
    std::ostringstream json;
    doc << db_name << db_names[db_type] << "/_find";
    json << "{\"selector\":{" << keyValue << "},\"limit\":1,\"use_index\":\"index\"}";
->  CouchDB::response status = sendRequest("POST", doc.str(), json.str());
    close(socketfd);

    if (status.httpcode == 200) {
        value = status.body;
        return true;
    }
    return false;
}

Some bits that you might have questions about:

  • CouchDB::response is a struct {httpcode: int, body: std::string}
  • CouchDB::db is an enum to choose different databases
  • sendData only sends anything as bytes until all bytes are sent
9
  • How do you know that the error isn't somewhere else entirely, for example corrupting memory in such a way that the program crashes half an hour later? Commented Nov 21, 2017 at 6:11
  • @user4581301 I did not think about this. I would need to make some changes in one of the classes that might be the problem to see if it's something there. If you have any advice on how I could find this do tell. Commented Nov 21, 2017 at 6:17
  • To be honest that was mostly me being snarky because you gave two snippets of code and we out here have no way of knowing that the program is sane when entering the provided code. Currently my suspicion is more along the lines of the recv calls not returning as much data as you think they are. A recv that does not fail or disconnect can return any amount from 0 to sizeof(buffer), so the first recv could contain too little for you to find and read a valid integer or the chunked tag. It could also contain data you expect to find in the next recv. Commented Nov 21, 2017 at 6:23
  • recv when successful returns the number of characters received, this number is essential, use it. There is no automatic null termination of buffers. Commented Nov 21, 2017 at 6:40
  • 3
    There is also no guarantee whatsoever that your entire message will be read with a single call to recv, even if its size is no greater than the number of characters requested. Commented Nov 21, 2017 at 6:45

1 Answer 1

1

Make it int len = recv(socketfd, buffer, sizeof(buffer), 0); might be overwriting the last '\0' in your buffer. One might be tempted to use sizeof(buffer) - 1 but this would be wrong as you might be getting null bytes in your stream. So, do this instead: responseBody.assign(buffer, len);. Only do this of course after you've made sure len >= 0, which you do in your error checks.

You have to do that every place where you call recv. Though, why you're using recv instead of read is beyond me, since you aren't using any of the flags.

Also, your buffer memset is pointless if you do it my way. You should also declare your buffer right before you use it. I had to read through half your function to figure out if you did anything with it. Though, of course, you do end up using it a second time.

Heck, since your error handling is basically identical in both cases, I would just make a function that did it. Don't repeat yourself.

Lastly, you play fast and loose with the result of find. You might not actually find what you're looking for and might get string::npos back instead, and that'd also cause you interesting problems.

Another thing, try -fsanitize=address (or some of the other sanitize options documented there) if you're using gcc or clang. And/or run it under valgrind. Your memory error may be far from the code that's crashing. Those might help you get close to it.

And, a very last note. Your logic is totally messed up. You have to separate out your reading data and your parsing and keep a different state machine for each. There is no guarantee that your first read gets the entire HTTP header, no matter how big that read is. And there is no guarantee that your header is less than a certain size either.

You have to keep reading until you've either read more than you're willing to for the header and consider it an error, or until you get the CR LN CR LN at the end of the header.

Those last bits won't cause your code to crash, but will cause you to get spurious errors, especially in certain traffic scenarios, which means that they will likely not show up in testing.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.