I have a sequence of hex bytes: 35 d8 de de de de 43 f2 71 84 4b f3 be 4d 4d 65 4a 17 41 bb 40 a5 85 c4 bd fd 7a 4e fb 24 27 4e

This is 32 bytes!

I do this:

    import java.nio.charset.StandardCharsets;

    public class Main {
        public static void main(String[] args) {
            String b = "35d8dededede43f271844bf3be4d4d654a1741bb40a585c4bdfd7a4efb24274e";
            byte[] bytes = fromHex(b);
            String st = new String(bytes, StandardCharsets.UTF_8);
            System.out.println(bytes.length);   // 32
            System.out.println(st.length());    // 30
        }

        // Converts a hex string (two digits per byte) into a byte array.
        private static byte[] fromHex(String hex) {
            byte[] binary = new byte[hex.length() / 2];
            for (int i = 0; i < binary.length; i++) {
                binary[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
            }
            return binary;
        }
    }

And I get this output:

32
30

But I expect to get a 32-character UTF-8 string! Why do I get a 30 character string? How can I get 32 UTF-8 bytes?

  • That 32-byte sequence DOES NOT represent a valid UTF-8 encoded string. For instance, the bytes d8 de de de de are not valid UTF-8. Where are you getting the hex string from exactly? Commented Jan 17, 2020 at 21:58

1 Answer

Why do I get a 30 character string?

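UTF-8 is a variable-length encoding, so the number of bytes and the number of characters need not match. That input contains byte sequences where multiple bytes are converted to a single Unicode code point when decoding from UTF-8. For example, c4 bd decodes to the single character U+013D. In addition, new String(bytes, charset) replaces malformed input with the replacement character U+FFFD rather than failing, and a malformed multi-byte sequence can likewise collapse into one character.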
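To make this concrete, here is a minimal sketch that decodes the two bytes c4 bd on their own and checks a byte array with a strict decoder. The class name `Utf8LengthDemo` and the helper `isValidUtf8` are made up for this illustration and are not part of the question's code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8LengthDemo {
    // Hypothetical helper: returns true only if the bytes are well-formed UTF-8.
    // Unlike new String(bytes, UTF_8), this decoder reports errors instead of
    // silently substituting U+FFFD.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Two bytes from the question's input that form one valid code point.
        byte[] twoBytes = {(byte) 0xC4, (byte) 0xBD};
        String s = new String(twoBytes, StandardCharsets.UTF_8);
        System.out.println(s.length());                          // 1
        System.out.println(Integer.toHexString(s.charAt(0)));    // 13d

        // d8 is a lead byte, but de is not a continuation byte.
        byte[] bad = {(byte) 0xD8, (byte) 0xDE};
        System.out.println(isValidUtf8(twoBytes));               // true
        System.out.println(isValidUtf8(bad));                    // false
    }
}
```

Running a check like this on input before decoding it is one way to find out early that the data is not text at all.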

How can I get 32 UTF-8 bytes?

You can't. Decoded as UTF-8, that input is a 30-character string.

And it's wrong anyway to say "UTF-8 bytes" about the result. Once decoded, a Java String holds characters (UTF-16 code units), not bytes.
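If the real goal is a 32-character string with exactly one character per byte, one possible approach, assuming the data is arbitrary binary rather than UTF-8 text, is to decode it as ISO-8859-1, which maps every byte value to a single character. The sketch below reuses the question's `fromHex`; the class name `ByteRoundTripDemo` and the helper `toLatin1` are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ByteRoundTripDemo {
    // Same conversion as fromHex in the question: two hex digits per byte.
    static byte[] fromHex(String hex) {
        byte[] binary = new byte[hex.length() / 2];
        for (int i = 0; i < binary.length; i++) {
            binary[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return binary;
    }

    // Hypothetical helper: one char per byte, nothing is merged or replaced.
    static String toLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] bytes = fromHex("35d8dededede43f271844bf3be4d4d654a1741bb40a585c4bdfd7a4efb24274e");

        String s = toLatin1(bytes);
        System.out.println(s.length());                     // 32

        // Encoding with the same charset gives the original bytes back.
        byte[] back = s.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(bytes, back));     // true

        // The UTF-8 round trip is lossy for this input, because the
        // malformed bytes were replaced during decoding.
        byte[] viaUtf8 = new String(bytes, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(bytes, viaUtf8));  // false
    }
}
```

The resulting string is only a container for the bytes and is not meaningful text. If the data has to travel as text, keeping it as hex (or Base64) is usually the safer choice.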
