I realize this question is probably idiotic, but hey, rough day. Anyway, given this:

scala> import java.nio.charset.Charset
import java.nio.charset.Charset

scala> val alpha = Array[Byte](2,-9,-7,-126,-36,-41,-16,56)
alpha: Array[Byte] = Array(2, -9, -7, -126, -36, -41, -16, 56)

scala> val beta = new String(alpha, Charset.forName("UTF-8"))
beta: String = ?������8

scala> val gamma = beta.getBytes(Charset.forName("UTF-8"))
gamma: Array[Byte] = Array(2, -17, -65, -67, -17, -65, -67, -17, -65, -67, -17, -65, -67, -17, -65, -67, -17, -65, -67, 56)

Why doesn't alpha == gamma? What's the correct way to do this?

Update: I see Base64 encoding/decoding works. But I am still interested in why UTF-8 doesn't. Perhaps it's because there is no UTF-8 representation of one or more of those bytes.

1 Answer

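UTF-8 encodes each code point as a sequence of one to four bytes, and not every byte sequence is valid UTF-8. Your negative values are the bytes 0x80 to 0xFF viewed as signed (-9 is 0xF7, for example), and such bytes are only legal as part of a well-formed multi-byte sequence. None of the six bytes between the 2 and the 56 forms one, so the decoder replaces each of them with U+FFFD, the replacement character. Encoding U+FFFD back to UTF-8 gives the three bytes -17, -65, -67 (0xEF 0xBF 0xBD), which is exactly the triple that repeats six times in gamma. The original byte values are already lost at the decoding step.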

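If you check new String(alpha) == new String(gamma), you will see that it returns true (assuming your platform's default charset is UTF-8). Both arrays decode to the same string, which is why the round trip cannot recover alpha.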
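As for the correct way to do this, here is a minimal sketch comparing the lossy UTF-8 round trip with two lossless alternatives. The object name ByteRoundTrip and its method names are made up for illustration; the calls themselves are standard java.nio.charset and java.util.Base64 APIs. Treat it as one possible approach, not the only one.

```scala
import java.nio.ByteBuffer
import java.nio.charset.{CharacterCodingException, CodingErrorAction, StandardCharsets}
import java.util.Base64

// Hypothetical helper object, named here only for illustration.
object ByteRoundTrip {
  // The bytes from the question.
  val alpha: Array[Byte] = Array[Byte](2, -9, -7, -126, -36, -41, -16, 56)

  // Lossy: new String(...) silently replaces malformed input with U+FFFD,
  // so the bytes that come back differ from the bytes that went in.
  def viaUtf8(bytes: Array[Byte]): Array[Byte] =
    new String(bytes, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8)

  // Strict check: a decoder set to REPORT throws on malformed input
  // instead of substituting the replacement character.
  def isValidUtf8(bytes: Array[Byte]): Boolean =
    try {
      StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT)
        .decode(ByteBuffer.wrap(bytes))
      true
    } catch {
      case _: CharacterCodingException => false
    }

  // Lossless: Base64 is designed to carry arbitrary bytes as text.
  def viaBase64(bytes: Array[Byte]): Array[Byte] = {
    val text: String = Base64.getEncoder.encodeToString(bytes)
    Base64.getDecoder.decode(text)
  }

  // Lossless: ISO-8859-1 maps every byte value to exactly one char,
  // though the resulting string is not meaningful text.
  def viaLatin1(bytes: Array[Byte]): Array[Byte] =
    new String(bytes, StandardCharsets.ISO_8859_1).getBytes(StandardCharsets.ISO_8859_1)
}
```

Note that == on two Scala arrays compares references, not contents, so compare the results with sameElements, e.g. ByteRoundTrip.viaBase64(alpha).sameElements(alpha). With the question's input, isValidUtf8(alpha) is false, which is the signal that a String built with UTF-8 is the wrong container for these bytes.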
