2

I am trying to create a helper class to execute a system command and get response back with piping support. For the cases where I need to get the response only (no STDIN to consume for the command) it is working as expected, for pipe support, I am getting garbled STDIN and I can not find out the root cause.

The main function which handles this mechanism is (please ignore the minor error check issues)

the minimal working example

#include <vector>
#include <string>
#include <iostream>
#include <sstream>
#include <unistd.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <signal.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <stdarg.h>

struct exec_cmd_t {
    exec_cmd_t(std::vector<std::string> args) : args(args), has_executed(false), cpid(-1) { }
    exec_cmd_t(const exec_cmd_t &) = delete;
    exec_cmd_t(exec_cmd_t &&) = delete;
    exec_cmd_t & operator=(const exec_cmd_t &) = delete;
    exec_cmd_t & operator=(exec_cmd_t &&) = delete;

    std::string operator()();

    std::string pipe_cmd(const std::string & input);
    std::string pipe_cmd();
    ~exec_cmd_t();

    private:
    std::vector<std::string> args;
    bool has_executed;
    int cpid;
    std::stringstream in_stream;
    std::stringstream out_stream;

    friend std::string operator | (exec_cmd_t & first, exec_cmd_t & second);
    friend std::string operator | (exec_cmd_t && first, exec_cmd_t && second);
    friend std::string operator | (std::string, exec_cmd_t & second);
    friend std::string operator | (std::string, exec_cmd_t && second);
};

std::string exec_cmd_t::pipe_cmd(const std::string & input) {
    this->has_executed = true;
    const int read_end = 0;
    const int write_end = 1;

    int read_pipe[2];
    int write_pipe[2];

    if (pipe(read_pipe) < 0 || pipe(write_pipe) < 0) {
        this->has_executed = false;
        return std::string{};
    }

    this->in_stream << input;
    std::string line;
    while(getline(this->in_stream, line)) {
        if (line.size() == 0) {
            continue;
        }
       int wr_sz = write(write_pipe[write_end], line.c_str(), line.size());
       if (wr_sz <= 0) {
           break;
       }
       write(write_pipe[write_end], "\n", 1);
    }
    close(write_pipe[write_end]);

    this->cpid = fork();
    if (this->cpid == 0) {
        dup2(write_pipe[read_end], STDIN_FILENO);
        dup2(read_pipe[write_end], STDOUT_FILENO);
        close(read_pipe[read_end]);
        close(write_pipe[write_end]);
        close(read_pipe[write_end]);
        close(write_pipe[read_end]);
        prctl(PR_SET_PDEATHSIG, SIGTERM);
        char * params[args.size()];
        const char * image_path = args[0].c_str();
        for(int i = 1; i < args.size(); i++) {
            params[i-1] = const_cast<char *>(args[i].c_str());
        }
        params[args.size()] = nullptr;
        execv(image_path, params);
        exit(1);
    }

    close(read_pipe[write_end]);
    close(write_pipe[read_end]);

    char buff[256];
    int rd_sz = -1;
    int flags = fcntl(read_pipe[0], F_GETFL, 0);
    fcntl(read_pipe[read_end], F_SETFL, flags | O_NONBLOCK);
    int status = 0;
    waitpid(this->cpid, &status, 0);
    this->has_executed = false;
    int error_code = 0;
    while((rd_sz = read(read_pipe[read_end], buff, sizeof(buff))) > 0) {
        buff[rd_sz] = '\0';
        this->out_stream << std::string{buff};
    }
    close(read_pipe[read_end]);
    return this->out_stream.str();
}


std::string exec_cmd_t::pipe_cmd() {
    static std::string empty_str{};
    return pipe_cmd(empty_str);
}

std::string exec_cmd_t::operator()() {
    return pipe_cmd();
}

exec_cmd_t::~exec_cmd_t() {
    if (this->has_executed) {
        int status;
        waitpid(this->cpid, &status, WNOHANG);
        if (!WIFEXITED(status)) {
            kill(this->cpid, SIGKILL);
            waitpid(this->cpid, &status, 0);
        }
    }
}

std::string operator | (exec_cmd_t & first, exec_cmd_t & second) {
    return second.pipe_cmd(first());
}

std::string operator | (exec_cmd_t && first, exec_cmd_t && second) {
    return second.pipe_cmd(first());
}

std::string operator | (std::string output, exec_cmd_t & second) {
    return second.pipe_cmd(output);
}

std::string operator | (std::string output, exec_cmd_t && second) {
    return second.pipe_cmd(output);
}

int main() {
    auto str = exec_cmd_t{ {"/bin/echo", "echo", "hello\nworld\nor\nnot"} } | exec_cmd_t{ {"/bin/grep", "grep", "world", "-"} };
    std::cout << str << std::endl;
    return 0;
}

gives me

grep: =V: No such file or directory                                                                                                                                                   
(standard input):world 

It seems like grep is executing twice, one failing with no such file or directory and another one is succeeding. Any suggestion would be very helpful :-) . Thanks in advance.

11
  • 1
    Your code seems to be C++, so you should remove the C tag. Your code is incomplete. Please show enough code to allow us to compile it and to reproduce the problem. Commented Sep 23, 2019 at 17:49
  • 1
    Yes, I don't think you should filter out anything in your pipe:ing. It can get confusing later on when you use it in real code and expect the blank lines. Commented Sep 23, 2019 at 18:46
  • 1
    @TedLyngmo agreed. Thank your for the suggestion :-) Commented Sep 23, 2019 at 18:47
  • 1
    @TedLyngmo Good point, thanks I will update the findings after testing. Commented Sep 23, 2019 at 21:54
  • 1
    Ok, I was going to try your code out myself but it seems to be missing some parts. Can you perhaps put together a minimal reproducible example? It looks interesting. Commented Sep 24, 2019 at 16:58

1 Answer 1

1

You have at east one cause for Undefined Behaviour that may cause your program to do what it does. You declare and use a VLA out-of-range like this:

char* params[args.size()];
...
params[args.size()] = nullptr;
execv(image_path, params);

This leaves the terminating char* in your params uninitialized so it could point anywhere. grep thinks it points at a filename, tries to open it and fails.

Since VLA:s aren't in the C++ standard, consider changing it to:

std::vector<char*> params(args.size());
...
params[args.size() - 1] = nullptr;
execv(image_path, params.data());

Another cause for concern is that you use ints where you should have used ssize_ts even though it's highly unlikely that you've read or written more than an int could handle.

After I made those changes, it started working and printed the expected world. I even added a third command to check it could handle it. Suggested changes:


14,15c14,15
<     exec_cmd_t(std::vector<std::string> args) :
<         args(args), has_executed(false), cpid(-1) {}
---
>     exec_cmd_t(std::vector<std::string> Args) :
>         args(Args), has_executed(false), cpid(-1), in_stream{}, out_stream{} {}
59c59
<         int wr_sz = write(write_pipe[write_end], line.c_str(), line.size());
---
>         ssize_t wr_sz = write(write_pipe[write_end], line.c_str(), line.size());
76c76
<         char* params[args.size()];
---
>         std::vector<char*> params(args.size());
78c78
<         for(int i = 1; i < args.size(); i++) {
---
>         for(decltype(args.size()) i = 1; i < args.size(); i++) {
81,82c81,82
<         params[args.size()] = nullptr;
<         execv(image_path, params);
---
>         params[args.size() - 1] = nullptr;
>         execv(image_path, params.data());
90c90
<     int rd_sz = -1;
---
>     ssize_t rd_sz = -1;
96c96
<     int error_code = 0;
---
>     // int error_code = 0; // unused
106,107c106
<     static std::string empty_str{};
<     return pipe_cmd(empty_str);
---
>     return pipe_cmd({});
143c142,143
<                exec_cmd_t{{"/bin/grep", "grep", "world", "-"}};
---
>                exec_cmd_t{{"/bin/grep", "grep", "-A1", "hello"}} |
>                exec_cmd_t{{"/bin/grep", "grep", "world"}};

I also realized that your program acts like a proxy between the piped commands, reading everything from one command and writing it to the next.

You could start all programs at the same time and setup the pipes between the started programs in one go. For three commands, you'd need three pipes:

                     cmd1  cmd2  cmd3
                     |  w--r  w--r  |
                 stdin              read output into program
or fed by your program

This would make performance and memory consumption less of an issue if you decide to run commands with a lot of output. Internally you'd would only need to store what you'd like to store by reading the output from the last command. I made a small test of this approach and it works like a charm.

Sign up to request clarification or add additional context in comments.

1 Comment

thanks, it was the UB (literally never crossed my mind). I was looking for the problem in wrong place. Although VLA works after the fix, but I think your suggestion is much better than VLAs. The performance improvement suggestion is really good one. Thanks again and have a great week!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.