String SHA-512 Encoding: C# and JAVA result is different

Question

I'm trying to compare SHA-512 hashes of the same string computed in Java and in C#, but the results are different. I suspect it is an encoding problem. I hope you can help me.

This is my Java code:

    MessageDigest digest = java.security.MessageDigest.getInstance("SHA-512"); 
    digest.update(MyString.getBytes()); 
    byte messageDigest[] = digest.digest();

    // Create Hex String
    StringBuffer hexString = new StringBuffer();
    for (int i = 0; i < messageDigest.length; i++) {
        String h = Integer.toHexString(0xFF & messageDigest[i]);
        while (h.length() < 2)
            h = "0" + h;
        hexString.append(h);
    }
    return hexString.toString();

And this is my C# code:

        UnicodeEncoding UE = new UnicodeEncoding();
        byte[] hashValue;
        byte[] message = UE.GetBytes(MyString);

        SHA512Managed hashString = new SHA512Managed();
        string hex = "";

        hashValue = hashString.ComputeHash(message);
        foreach (byte x in hashValue)
        {
            hex += String.Format("{0:x2}", x);

        }
        return hex;

Where is the problem? Thanks very much.

UPDATE

If I don't specify an encoding, I think it assumes Unicode. This is the result without specifying anything:

Java SHA: a99951079450e0bf3cf790872336b3269da580b62143af9cfa27aef42c44ea09faa83e1fbddfd1135e364ae62eb373c53ee4e89c69b54a7d4d268cc2274493a8

C# SHA: 70e6eb559cbb062b0c865c345b5f6dbd7ae9c2d39169571b6908d7df04642544c0c4e6e896e6c750f9f135ad05280ed92b9ba349de12526a28e7642721a446aa

Instead, if I specify UTF-16 in Java:

Java SHA (UTF-16): f7a587d55916763551e9fcaafd24d0995066371c41499fcb04614325cd9d829d1246c89af44b98034b88436c8acbd82cd13ebb366d4ab81b4942b720f02b0d9b

It's always different!

What happens when you specify the encoding in MyString.getBytes()? (Bad variable name, btw.)
– Hauke Ingmar Schmidt, Commented Feb 24, 2012 at 18:29
Have you compared the bytes of MyString before computing hash?
– L.B, Commented Feb 24, 2012 at 18:30
It would be nice to provide us full code samples and your input/output as well.
– wkl, Commented Feb 24, 2012 at 18:32

Joni · Accepted Answer · 2012-02-24 23:45:04Z

6

The UnicodeEncoding you use in C# corresponds to the little-endian UTF-16 encoding, while "UTF-16" in Java corresponds to the big-endian UTF-16 encoding. Another difference is that C# doesn't output the byte order mark (called "preamble" in the API) if you don't ask for it, while "UTF-16" in Java always generates it. To make the two programs compatible, you can make Java use little-endian UTF-16 as well:

    digest.update(MyString.getBytes("UTF-16LE"));

Or you could switch to some other well known encoding, like UTF-8.
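To make the byte-level difference described above visible, here is a minimal sketch that prints the bytes Java produces for the single character "A" under each UTF-16 variant. The class name `Utf16Bytes` and the helper `hex` are made up for this illustration and are not part of the question's code.

```java
import java.nio.charset.StandardCharsets;

class Utf16Bytes {
    // Render bytes as space-separated, lowercase, two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "A"; // U+0041

        // "UTF-16": byte order mark (fe ff) followed by big-endian code units.
        System.out.println("UTF-16   : " + hex(s.getBytes(StandardCharsets.UTF_16)));   // fe ff 00 41
        // "UTF-16BE": big-endian, no byte order mark.
        System.out.println("UTF-16BE : " + hex(s.getBytes(StandardCharsets.UTF_16BE))); // 00 41
        // "UTF-16LE": little-endian, no byte order mark.
        System.out.println("UTF-16LE : " + hex(s.getBytes(StandardCharsets.UTF_16LE))); // 41 00
    }
}
```

Only the UTF-16LE output has the byte order and the missing byte order mark that the answer attributes to UnicodeEncoding.GetBytes, so it is the only variant whose hash can match the C# result.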

edited Feb 24, 2012 at 23:45

answered Feb 24, 2012 at 21:42


BalusC · Accepted Answer · 2012-02-25 04:33:55Z

6

Here,

    digest.update(MyString.getBytes());

you should explicitly specify the desired character encoding in the String#getBytes() method. Otherwise it defaults to the platform default charset, as returned by Charset#defaultCharset().

Fix it accordingly:

    digest.update(MyString.getBytes("UTF-16LE"));

It must at least be the same charset that UnicodeEncoding uses internally.

Unrelated to the concrete problem: Java also has an enhanced for loop and String#format().
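Putting the suggestions in this answer together, the Java method might look roughly like the sketch below. The class name `Sha512Utf16Le` and the method name `sha512Hex` are invented for this example. It assumes UTF-16LE without a byte order mark is what the C# side hashes, and it wraps the checked exception only to keep the example short.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha512Utf16Le {
    // Hash the UTF-16LE bytes of the input and return lowercase hex.
    static String sha512Hex(String input) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available", e);
        }

        // Explicit charset: no dependence on the platform default.
        byte[] hash = digest.digest(input.getBytes(StandardCharsets.UTF_16LE));

        // Enhanced for loop plus String.format instead of manual zero padding.
        StringBuilder hex = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(sha512Hex("MyString"));
    }
}
```

Using the StandardCharsets constant instead of the string "UTF-16LE" also avoids the checked UnsupportedEncodingException that getBytes(String) declares.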

edited Feb 25, 2012 at 4:33

answered Feb 24, 2012 at 18:38


1 Comment

BalusC Over a year ago

UnicodeEncoding is apparently using UTF-16LE. I've updated the answer.

Jörn Horstmann · Accepted Answer · 2012-02-24 21:39:18Z

3

The reason is probably that you did not specify the encoding to use when converting the string to bytes. Java uses the platform default encoding, while UnicodeEncoding seems to use UTF-16.

Edit:

The documentation for UnicodeEncoding says

This constructor creates an instance that uses the little endian byte order, provides a Unicode byte order mark, and does not throw an exception when an invalid encoding is detected.

Java's "utf-16", however, seems to default to big-endian byte order. With character encodings it's better to be really specific: there is a UnicodeEncoding constructor that takes two booleans specifying the byte order and the byte order mark, while in Java there are also "utf-16le" and "utf-16be". You could try the following in C#

    new UnicodeEncoding(true, false) // big endian, no byte order mark

and in Java

    MyString.getBytes("utf-16be")

Or, even better, use "utf-8" / Encoding.UTF8 in both cases, since UTF-8 is not affected by byte order.
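As a rough illustration of why the charset has to be pinned down on both sides, the following sketch hashes the same string under several charsets and prints each digest. The class name `CharsetHashDemo` and the method `sha512Hex` are hypothetical names chosen for this example. The input "abc" is used because it is a commonly published SHA-512 test input when encoded as single bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class CharsetHashDemo {
    // SHA-512 of the input's bytes in the given charset, as lowercase hex.
    static String sha512Hex(String input, Charset charset) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-512");
            byte[] hash = digest.digest(input.getBytes(charset));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available", e);
        }
    }

    public static void main(String[] args) {
        String input = "abc";
        Charset[] charsets = {
            StandardCharsets.UTF_8,    // no byte order involved
            StandardCharsets.UTF_16LE, // little-endian, no byte order mark
            StandardCharsets.UTF_16BE, // big-endian, no byte order mark
            StandardCharsets.UTF_16    // big-endian with byte order mark
        };
        // Same text, different bytes, so every line shows a different digest.
        for (Charset cs : charsets) {
            System.out.println(cs.name() + ": " + sha512Hex(input, cs));
        }
    }
}
```

The UTF-8 line is the one to compare against a C# hash computed over Encoding.UTF8.GetBytes(...), which likewise emits no byte order mark.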

edited Feb 24, 2012 at 21:39

answered Feb 24, 2012 at 18:33


1 Comment

kinghomer Over a year ago

Nope. Different result occurs however
