I have a JSON String that I wrap in an InputStream, as shown below, and then decode into a GenericRecord, because I am trying to convert the JSON object to Avro using my schema.
InputStream input = new ByteArrayInputStream(jsonString.getBytes());
DataInputStream din = new DataInputStream(input);
Decoder decoder = DecoderFactory.get().jsonDecoder(schema, din);
DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>(schema);
// the line below throws the exception
GenericRecord datum = reader.read(null, decoder);
Below is the exception I am getting:
org.codehaus.jackson.JsonParseException: Invalid UTF-8 middle byte 0x2d at [Source: java.io.DataInputStream@562aee31; line: 1, column: 74]
And here is the actual JSON string that triggers the exception:
{"name":"car_test","attr_value":"2006|Renault|Megane II Coupé-Cabriolet|null|null|null|null|0|Wed Feb 03 10:00:59 GMT-07:00 2016|1|77|null|null|null|null","data_id":900}
I did some research and found that I need to specify UTF-8 explicitly when converting the String to bytes for the ByteArrayInputStream, as shown below:
InputStream input = new ByteArrayInputStream(jsonString.getBytes(StandardCharsets.UTF_8));
But my question is: what is the reason for this exception, and why does it happen on the JSON string above? Is using UTF-8 the right fix?
What does the error "Invalid UTF-8 middle byte 0x2d" mean?
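To narrow it down, I put together a small check that compares the bytes produced for the "é-" part of "Coupé-Cabriolet" under two encodings. This is only a sketch: it assumes my platform default charset is a single-byte one such as ISO-8859-1 (I have not confirmed that), and the class name Utf8MiddleByteDemo and the hex helper are just names I made up for the test.

```java
import java.nio.charset.StandardCharsets;

public class Utf8MiddleByteDemo {

    // Formats a byte array as space-separated lowercase hex, e.g. "e9 2d".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "é-" taken from "Coupé-Cabriolet" in the JSON string.
        String fragment = "\u00e9-";

        // Assumption: the platform default behaves like ISO-8859-1.
        byte[] latin1 = fragment.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = fragment.getBytes(StandardCharsets.UTF_8);

        // ISO-8859-1 encodes 'é' as the single byte 0xe9.
        System.out.println("ISO-8859-1: " + hex(latin1)); // e9 2d
        // UTF-8 encodes 'é' as the two bytes 0xc3 0xa9.
        System.out.println("UTF-8:      " + hex(utf8));   // c3 a9 2d
    }
}
```

If that assumption holds, the parser would see 0xe9 (binary 1110 1001, which in UTF-8 starts a three-byte sequence) followed by 0x2d, the '-' character, which is not a valid continuation byte of the form 10xxxxxx. That would seem to match the "middle byte 0x2d" in the message, but I would like confirmation that this is really what is going on.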