
I'm trying to compare two strings hashed with SHA-512, but the results differ. I suspect an encoding problem. I hope you can help me.

This is my Java code:

    MessageDigest digest = java.security.MessageDigest.getInstance("SHA-512"); 
    digest.update(MyString.getBytes()); 
    byte messageDigest[] = digest.digest();

    // Create Hex String
    StringBuffer hexString = new StringBuffer();
    for (int i = 0; i < messageDigest.length; i++) {
        String h = Integer.toHexString(0xFF & messageDigest[i]);
        while (h.length() < 2)
            h = "0" + h;
        hexString.append(h);
    }
    return hexString.toString();

And this is my C# code:

        UnicodeEncoding UE = new UnicodeEncoding();
        byte[] hashValue;
        byte[] message = UE.GetBytes(MyString);

        SHA512Managed hashString = new SHA512Managed();
        string hex = "";

        hashValue = hashString.ComputeHash(message);
        foreach (byte x in hashValue)
        {
            hex += String.Format("{0:x2}", x);

        }
        return hex;

Where is the problem? Thanks a lot.

UPDATE

If I don't specify an encoding, I think it assumes Unicode. These are the results without specifying anything:

Java SHA: a99951079450e0bf3cf790872336b3269da580b62143af9cfa27aef42c44ea09faa83e1fbddfd1135e364ae62eb373c53ee4e89c69b54a7d4d268cc2274493a8

C# SHA: 70e6eb559cbb062b0c865c345b5f6dbd7ae9c2d39169571b6908d7df04642544c0c4e6e896e6c750f9f135ad05280ed92b9ba349de12526a28e7642721a446aa

If I specify UTF-16 in Java instead:

Java SHA (UTF-16): f7a587d55916763551e9fcaafd24d0995066371c41499fcb04614325cd9d829d1246c89af44b98034b88436c8acbd82cd13ebb366d4ab81b4942b720f02b0d9b

The results are always different!

  • What happens when you specify the encoding in MyString.getBytes()? (Bad variable name, btw.) Commented Feb 24, 2012 at 18:29
  • Have you compared the bytes of MyString before computing hash? Commented Feb 24, 2012 at 18:30
  • It would be nice to provide us full code samples and your input/output as well. Commented Feb 24, 2012 at 18:32
  • Your encoding types are different. Commented Feb 24, 2012 at 18:37
  • I hope this isn't used for password hashing... Commented Feb 24, 2012 at 18:50

3 Answers


The UnicodeEncoding you use in C# corresponds to little-endian UTF-16, while "UTF-16" in Java encodes as big-endian UTF-16. Another difference is that C# doesn't output the byte order mark (called "preamble" in the API) unless you ask for it, while "UTF-16" in Java always generates it. To make the two programs compatible, you can make Java use little-endian UTF-16 as well:

    digest.update(MyString.getBytes("UTF-16LE"));

Or you could switch to some other well known encoding, like UTF-8.
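To see the byte-level difference described above, here is a minimal Java sketch that prints the bytes each charset produces for a one-character string. The class name `Utf16BytesDemo`, the helper `toHex`, and the sample input "A" are illustrative assumptions, not part of the question's code.

```java
import java.nio.charset.StandardCharsets;

final class Utf16BytesDemo {
    // Renders bytes as lowercase hex pairs separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "A"; // hypothetical sample input

        // "UTF-16": big-endian byte order, preceded by a byte order mark.
        System.out.println(toHex(text.getBytes(StandardCharsets.UTF_16)));   // fe ff 00 41
        // "UTF-16BE": big-endian, no byte order mark.
        System.out.println(toHex(text.getBytes(StandardCharsets.UTF_16BE))); // 00 41
        // "UTF-16LE": little-endian, no byte order mark.
        System.out.println(toHex(text.getBytes(StandardCharsets.UTF_16LE))); // 41 00
    }
}
```

Since the hash is computed over these bytes, any difference here (byte order or an extra byte order mark) yields a completely different digest.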



Here,

    digest.update(MyString.getBytes());

you should explicitly specify the desired character encoding in the String#getBytes() method. Otherwise it defaults to the platform default charset, as returned by Charset#defaultCharset().

Fix it accordingly:

    digest.update(MyString.getBytes("UTF-16LE"));

It must at least be the same charset that UnicodeEncoding uses internally.


Unrelated to the concrete problem: Java also has an enhanced for loop and String#format().
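Putting both remarks together, the Java side might look like the following sketch. The class name `Sha512Utf16Le` and the method name `sha512Hex` are hypothetical; wrapping the checked exception in an `IllegalStateException` is a design choice made here, not something the question's code does.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Sha512Utf16Le {
    // Hashes the UTF-16LE bytes of the input (no byte order mark) and
    // returns the digest as 128 lowercase hex characters.
    static String sha512Hex(String input) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            // Assumption: treat a missing SHA-512 provider as a fatal setup error.
            throw new IllegalStateException("SHA-512 is not available", e);
        }
        byte[] hash = digest.digest(input.getBytes(StandardCharsets.UTF_16LE));

        StringBuilder hex = new StringBuilder(hash.length * 2);
        for (byte b : hash) {                     // enhanced for loop
            hex.append(String.format("%02x", b)); // two hex digits per byte
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(sha512Hex("MyString")); // hypothetical input
    }
}
```

Using the `StandardCharsets.UTF_16LE` constant instead of the string "UTF-16LE" avoids the checked `UnsupportedEncodingException` that `getBytes(String)` declares.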

1 Comment

UnicodeEncoding is apparently using UTF-16LE. I've updated the answer.

The reason is probably that you did not specify the encoding to use when converting the string to bytes. Java uses the platform default encoding, while UnicodeEncoding seems to use UTF-16.

Edit:

The documentation for UnicodeEncoding says

This constructor creates an instance that uses the little endian byte order, provides a Unicode byte order mark, and does not throw an exception when an invalid encoding is detected.

Java's "utf-16", however, seems to default to big-endian byte order. With character encodings it's better to be really specific: there is a UnicodeEncoding constructor taking two booleans that specify the byte order and whether to emit a byte order mark, while Java also has "utf-16le" and "utf-16be". You could try the following in C#:

    new UnicodeEncoding(true, false) // big endian, no byte order mark

and in Java:

    MyString.getBytes("utf-16be")

Or, even better, use "utf-8" / Encoding.UTF8 in both cases, since it is not affected by byte order.
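If you go the UTF-8 route, one way to check each side independently is to hash the ASCII string "abc" and compare the result with the published SHA-512 test vector for those three bytes. Below is a minimal Java sketch of that check; the class name `Sha512Utf8Check`, the method `sha512HexUtf8`, and the constant `ABC_VECTOR` are hypothetical names introduced for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Sha512Utf8Check {
    // Published SHA-512 test vector for the three ASCII bytes "abc".
    static final String ABC_VECTOR =
            "ddaf35a193617abacc417349ae20413112e6fa4e89a97ea20a9eeee64b55d39a"
          + "2192992a274fc1a836ba3c23a3feebbd454d4423643ce80e2a9ac94fa54ca49f";

    // Hashes the UTF-8 bytes of the input and returns lowercase hex.
    static String sha512HexUtf8(String input) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-512")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(Character.forDigit((b >> 4) & 0xF, 16)); // high nibble
                hex.append(Character.forDigit(b & 0xF, 16));        // low nibble
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Assumption: treat a missing SHA-512 provider as a fatal setup error.
            throw new IllegalStateException("SHA-512 is not available", e);
        }
    }

    public static void main(String[] args) {
        // Prints whether this implementation reproduces the reference vector.
        System.out.println(sha512HexUtf8("abc").equals(ABC_VECTOR));
    }
}
```

If the C# side, fed with `Encoding.UTF8.GetBytes("abc")`, produces the same hex string, both implementations agree on encoding and hex formatting, and any remaining mismatch must come from the input strings themselves.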

1 Comment

Nope. I still get a different result.
