My previous question was marked as a duplicate of "Which encoding does Process.getInputStream() use?", but that is not what I'm asking. In my second example, UTF-8 decodes the special character successfully. However, when the same character is read from the process input stream, it can no longer be decoded correctly as UTF-8. Why does this happen, and does it mean ISO_8859_1 is the only option I can choose?
I'm working on a plugin that retrieves an Azure Key Vault secret at runtime. However, there is an encoding issue. I stored a string containing the special character ç; the string is: HrIaMFBc78!?%$timodagetwiçç99. With the following program, the special character ç is not decoded correctly:
package com.buildingblocks.azure.cli;

import java.io.*;
import java.nio.charset.StandardCharsets;

public class Test {
    static String decodeText(String command) throws IOException, InterruptedException {
        Process p;
        StringBuilder output = new StringBuilder();
        p = Runtime.getRuntime().exec("cmd.exe /c \"" + command + "\"");
        p.waitFor();
        InputStream stream;
        if (p.exitValue() != 0) {
            stream = p.getErrorStream();
        } else {
            stream = p.getInputStream();
        }
        BufferedReader reader = new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8));
        String line = "";
        while ((line = reader.readLine()) != null) {
            output.append(line + "\n");
        }
        return output.toString();
    }

    public static void main(String[] arg) throws IOException, InterruptedException {
        System.out.println(decodeText("az keyvault secret show --name \"test-password\" --vault-name \"test-keyvault\""));
    }
}
The output is: "value": "HrIaMFBc78!?%$timodagetwi��99"
If I use the following program to decode the string, the special character ç is decoded successfully.
package com.buildingblocks.azure.cli;

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Test {
    static String decodeText(String input, String encoding) throws IOException {
        return
            new BufferedReader(
                new InputStreamReader(
                    new ByteArrayInputStream(input.getBytes()),
                    Charset.forName(encoding)))
                .readLine();
    }

    public static void main(String[] arg) throws IOException {
        System.out.println(decodeText("HrIaMFBc78!?%$timodagetwiçç99", StandardCharsets.UTF_8.toString()));
    }
}
Both programs use a BufferedReader with the same setup, but the one reading the process output fails. Does anybody know the reason for this?
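One possible way to see the difference between the two programs is to look at the bytes rather than at the reader setup. In the second program the bytes come from input.getBytes() inside the same JVM, so they are presumably already in an encoding that the UTF-8 decoder accepts. In the first program the bytes come from another process, which may encode ç differently. The sketch below is only an illustration: the class name CharsetMismatchDemo and the helper roundTrip are made up for this example, and it assumes (without having verified it) that the child process emits ç as the single byte 0xE7, as ISO-8859-1 does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    // Hypothetical helper: encode text with one charset, then decode
    // the resulting bytes with another, as a reader on a pipe would.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // "twiçç99", written with escapes so the source file encoding does not matter.
        String tail = "twi\u00e7\u00e799";

        // Same charset on both sides: the text survives.
        System.out.println(roundTrip(tail, StandardCharsets.UTF_8, StandardCharsets.UTF_8));

        // Assumed situation of the first program: single-byte 0xE7 per 'ç',
        // decoded as UTF-8. 0xE7 alone is not valid UTF-8, so each one
        // becomes the replacement character U+FFFD.
        System.out.println(roundTrip(tail, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));

        // Decoding those same bytes as ISO-8859-1 restores the text.
        System.out.println(roundTrip(tail, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1));
    }
}
```

Under that assumption, the mismatched case yields two replacement characters in place of çç, which matches the output shown in the question.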
Use Charset.defaultCharset() to get the encoding of the system. Don't just use ISO_8859_1 on all Windows platforms.
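A minimal sketch of that suggestion follows. The class ProcessOutputReader and its methods are hypothetical names for illustration. It assumes the child process writes its output in the platform's native encoding, which may not hold for every tool. On Java 18 and later, Charset.defaultCharset() is UTF-8 by default regardless of the platform, so the sketch first checks the native.encoding system property (available since Java 17) and only falls back to Charset.defaultCharset() on older runtimes. It also reads the output before calling waitFor(), since a child that fills the pipe buffer can otherwise block.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class ProcessOutputReader {
    // Best guess at the platform encoding. "native.encoding" is set on
    // Java 17+; older runtimes fall back to the default charset.
    static Charset platformCharset() {
        String nativeEncoding = System.getProperty("native.encoding");
        return nativeEncoding != null
                ? Charset.forName(nativeEncoding)
                : Charset.defaultCharset();
    }

    // Runs a command and decodes its combined stdout/stderr with the given charset.
    static String run(Charset charset, String... command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();

        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        process.waitFor(); // wait only after the output has been drained
        return output.toString();
    }
}
```

With the command from the question, a call might look like run(platformCharset(), "cmd.exe", "/c", "az keyvault secret show --name \"test-password\" --vault-name \"test-keyvault\""). Whether this decodes ç correctly depends on which encoding the az CLI actually uses for piped output on the machine in question.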