I can't figure out how to get UTF-8 output from a process started with CreateProcess() into a wstring.

Currently I use the following function, but it does not give me UTF-8 output:

HANDLE g_hChildStd_OUT_Rd = NULL;
HANDLE g_hChildStd_OUT_Wr = NULL;
HANDLE g_hChildStd_ERR_Rd = NULL;
HANDLE g_hChildStd_ERR_Wr = NULL;

PROCESS_INFORMATION CreateChildProcess(void);
void ReadFromPipe(PROCESS_INFORMATION);

string run(char *command){
    SECURITY_ATTRIBUTES sa;
    sa.nLength = sizeof(SECURITY_ATTRIBUTES);
    sa.bInheritHandle = TRUE;
    sa.lpSecurityDescriptor = NULL;
    if ( ! CreatePipe(&g_hChildStd_ERR_Rd, &g_hChildStd_ERR_Wr, &sa, 0) ) {
        exit(1);
    }
    if ( ! SetHandleInformation(g_hChildStd_ERR_Rd, HANDLE_FLAG_INHERIT, 0) ){
        exit(1);
    }
    if ( ! CreatePipe(&g_hChildStd_OUT_Rd, &g_hChildStd_OUT_Wr, &sa, 0) ) {
        exit(1);
    }
    if ( ! SetHandleInformation(g_hChildStd_OUT_Rd, HANDLE_FLAG_INHERIT, 0) ){
        exit(1);
    }
    char *szCmdline=command;
    PROCESS_INFORMATION piProcInfo;
    STARTUPINFO siStartInfo;
    bool bSuccess = FALSE;
    ZeroMemory( &piProcInfo, sizeof(PROCESS_INFORMATION) );
    ZeroMemory( &siStartInfo, sizeof(STARTUPINFO) );
    siStartInfo.cb = sizeof(STARTUPINFO);
    siStartInfo.hStdError = g_hChildStd_ERR_Wr;
    siStartInfo.hStdOutput = g_hChildStd_OUT_Wr;
    siStartInfo.dwFlags |= STARTF_USESTDHANDLES;
    bSuccess = CreateProcess(NULL,
        szCmdline,     // command line
        NULL,          // process security attributes
        NULL,          // primary thread security attributes
        TRUE,          // handles are inherited
        CREATE_NO_WINDOW,             // creation flags
        NULL,          // use parent's environment
        NULL,          // use parent's current directory
        &siStartInfo,  // STARTUPINFO pointer
        &piProcInfo);  // receives PROCESS_INFORMATION
    CloseHandle(g_hChildStd_ERR_Wr);
    CloseHandle(g_hChildStd_OUT_Wr);
    if ( ! bSuccess ) {

        exit(1);
    }
    DWORD dwRead;
    CHAR chBuf[BUFSIZE];
    bool bSuccess2 = FALSE;
    std::string out = "", err = "";
    for (;;) {
        bSuccess2=ReadFile( g_hChildStd_OUT_Rd, chBuf, BUFSIZE, &dwRead, NULL);
        if( ! bSuccess2 || dwRead == 0 ) break;

        std::string s(chBuf, dwRead);
        out += s;
    }
    dwRead = 0;
    for (;;) {
        bSuccess2=ReadFile( g_hChildStd_ERR_Rd, chBuf, BUFSIZE, &dwRead, NULL);
        if( ! bSuccess2 || dwRead == 0 ) break;

        std::string s(chBuf, dwRead);
        err += s;
    }

    return out;
}

I tried several things but did not succeed in making it work.

Any help is appreciated!

  • Why do you expect the child process to be outputting UTF-8? FYI, std::wstring on Windows is usually used for UTF-16. Commented Aug 17, 2016 at 22:44
  • There are some characters like č,ć,ž that are printed when the command is executed using CreateProcess() so that's why I need it with wstring. Commented Aug 17, 2016 at 22:50
  • They are most likely MBCS on a code page you would need to determine. Commented Aug 17, 2016 at 23:09
  • Pipes deal in raw bytes, not characters. What do the raw bytes actually look like in the output you are having trouble with? If you post the bytes here, and the string output you are expecting, someone can likely help identify the encoding being used. Commented Aug 18, 2016 at 6:06
  • I got output like this: prntscr.com/c7982a, but it should look like this: prntscr.com/c7989y Commented Aug 18, 2016 at 11:20
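
One way to act on the raw-bytes suggestion above is to print what the pipe delivered as hex before attempting any decoding. The following is a minimal sketch; `hex_dump` is a hypothetical helper name, not something from the question's code.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: render every byte of `bytes` as two uppercase hex
// digits, separated by single spaces.
std::string hex_dump(const std::string& bytes) {
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned value = static_cast<unsigned char>(bytes[i]);
        std::snprintf(buf, sizeof buf, "%02X", value);
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

Calling `hex_dump(out)` on the string returned by run() shows the encoding in use: if č arrives as the two bytes C4 8D, the child is already emitting UTF-8, whereas a single byte per accented character points to a legacy single-byte code page.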

1 Answer

The output of a command is a byte stream. So you read it as a byte stream. It's up to the two programs to agree on the encoding to use.

For example:

  • If you execute a .NET (C#/VB.NET) console application, the application can set Console.OutputEncoding to choose the encoding that the Console.Write[Line] methods will use.

    Console.OutputEncoding = System.Text.Encoding.UTF8;
    
  • Similarly, a PowerShell script can set [Console]::OutputEncoding to choose the encoding that the Write-Output or Write-Host cmdlets will use.

    [Console]::OutputEncoding = [Text.Encoding]::UTF8
    
  • cmd.exe or a batch file can use the chcp command.

    chcp 65001
    
  • A Win32 application can use the SetConsoleOutputCP function to set its output encoding if it writes with WriteConsole. If the application writes with WriteFile, it just needs to write bytes that are already encoded as desired (e.g. using WideCharToMultiByte).

    SetConsoleOutputCP(CP_UTF8);
    

When you then read the application output, you decode the byte stream using the agreed encoding, e.g. with the MultiByteToWideChar function.
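
As a rough sketch of that decoding step, assuming the child has been made to emit UTF-8, the bytes collected from the pipe could be converted as follows. `utf8_to_wide` is a hypothetical helper name. The `#else` branch is only a stand-in so the sketch also builds off Windows (where wchar_t is typically 32 bits wide); on Windows the work is done by MultiByteToWideChar.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: decode UTF-8 bytes read from the pipe into a wstring.
std::wstring utf8_to_wide(const std::string& bytes) {
    if (bytes.empty()) return std::wstring();
#ifdef _WIN32
    const int len = static_cast<int>(bytes.size());
    // First call: ask how many UTF-16 code units the result needs.
    const int needed = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                           bytes.data(), len, nullptr, 0);
    if (needed <= 0) throw std::runtime_error("input is not valid UTF-8");
    std::wstring wide(static_cast<std::size_t>(needed), L'\0');
    // Second call: convert into the buffer sized above.
    MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                        bytes.data(), len, &wide[0], needed);
    return wide;
#else
    // Stand-in for non-Windows builds: a minimal decoder that stores one
    // code point per wchar_t and does not reject overlong forms.
    std::wstring wide;
    for (std::size_t i = 0; i < bytes.size();) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        // Number of continuation bytes implied by the lead byte.
        const int extra = lead < 0x80 ? 0
                        : (lead >> 5) == 0x06 ? 1
                        : (lead >> 4) == 0x0E ? 2
                        : (lead >> 3) == 0x1E ? 3 : -1;
        if (extra < 0 || i + static_cast<std::size_t>(extra) >= bytes.size())
            throw std::runtime_error("input is not valid UTF-8");
        unsigned long cp = extra == 0 ? lead : (lead & (0x3F >> extra));
        for (int k = 1; k <= extra; ++k) {
            const unsigned char cont = static_cast<unsigned char>(bytes[i + k]);
            if ((cont & 0xC0) != 0x80)
                throw std::runtime_error("input is not valid UTF-8");
            cp = (cp << 6) | (cont & 0x3F);
        }
        wide.push_back(static_cast<wchar_t>(cp));
        i += static_cast<std::size_t>(extra) + 1;
    }
    return wide;
#endif
}
```

In the question's run() function this would mean returning `utf8_to_wide(out)` and changing the return type to std::wstring. Decode only after all chunks have been appended, since a multi-byte character can be split across two ReadFile calls. If the child turns out to use a legacy code page instead, that code page would be passed in place of CP_UTF8.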

2 Comments

I think the last bullet should refer to WriteConsoleA. There seems little point in calling SetConsoleCP when you're calling WriteConsoleW. The WriteFile part is of course correct; there's no such thing as WriteFileA/W because it writes binary data instead of text.
@MSalters I do not think you are right. With WriteConsoleA you supply the output in the OEM encoding, and with WriteConsoleW you supply it in UTF-16 LE. The system converts either one to the default encoding. But with SetConsoleCP you can override that, to use e.g. UTF-8. How would you make your application output UTF-8 with WriteConsoleW alone?
