
I am getting the same MD5 value on iOS and Windows, but the Java code gives a different value.

iOS code for MD5 hashing

// Requires #import <CommonCrypto/CommonDigest.h>
- (NSString*)md5HexDigest:(NSString*)input
{
    // Encode as UTF-16 little-endian (no BOM) before hashing
    NSData *data = [input dataUsingEncoding:NSUTF16LittleEndianStringEncoding];
    unsigned char result[CC_MD5_DIGEST_LENGTH];
    CC_MD5([data bytes], (CC_LONG)[data length], result);

    NSMutableString *ret = [NSMutableString stringWithCapacity:CC_MD5_DIGEST_LENGTH * 2];
    for (int i = 0; i < CC_MD5_DIGEST_LENGTH; i++) {
        [ret appendFormat:@"%02x", result[i]];
    }
    return ret;
}

Windows (C#) code for MD5 hashing

// UnicodeEncoding defaults to UTF-16 little-endian with no BOM,
// which matches the iOS encoding above.
private static string GetMD5(string text)
{
    UnicodeEncoding UE = new UnicodeEncoding();
    byte[] message = UE.GetBytes(text);

    MD5 hashString = new MD5CryptoServiceProvider();
    byte[] hashValue = hashString.ComputeHash(message);

    string hex = "";
    foreach (byte x in hashValue)
    {
        hex += String.Format("{0:x2}", x);
    }
    return hex;
}

Java code for MD5 hashing. I tried UTF-8, UTF-16, and UTF-32, but the result does not match the iOS and Windows output.

public String MD5(String md5) {
    try {
        String dat1 = md5.trim();
        java.security.MessageDigest md = java.security.MessageDigest.getInstance("MD5");
        byte[] array = md.digest(dat1.getBytes("UTF-16"));
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < array.length; ++i) {
            sb.append(Integer.toHexString((array[i] & 0xFF) | 0x100).substring(1, 3));
        }
        System.out.println("Digest (in hex format): " + sb.toString());
        return sb.toString();
    } catch (java.security.NoSuchAlgorithmException e) {
    } catch (java.io.UnsupportedEncodingException e) {
    }
    return null;
}

thanks

  • What if you use UTF-16LE? Commented Mar 24, 2015 at 14:17
  • Does your weird Integer.toHexString() code actually give correct results? Commented Mar 24, 2015 at 14:19
  • UTF-16LE worked, thanks Kayaman. Commented Mar 24, 2015 at 14:30
  • @Kayaman Just curious: why does making use of UTF-16LE work here? What was wrong with the existing code by OP? What is the difference, other words? Thanks in advance. Commented Mar 24, 2015 at 14:34
  • @Unheilig Using just UTF-16 puts the BOM (0xFEFF or 0xFFFE) in the beginning, to specify the endianness. Using UTF-16BE or UTF-16LE explicitly leaves the BOM out (apparently). Unicode is a b*tch. Commented Mar 24, 2015 at 14:40

1 Answer


Here is a short overview of what getBytes() returns for each character set (all credit goes to @Kayaman):

"123".getBytes("UTF-8")   :                31 32 33 
"123".getBytes("UTF-16")  : FE FF 00 31 00 32 00 33 
"123".getBytes("UTF-16LE"):       31 00 32 00 33 00 
"123".getBytes("UTF-16BE"):       00 31 00 32 00 33 

It shows that the BOM is added only when the endianness is not specified: plain "UTF-16" in Java encodes big-endian and prepends the BOM, while the explicit UTF-16LE and UTF-16BE variants leave it out. Since both the iOS and C# code hash little-endian bytes without a BOM, UTF-16LE is the encoding that makes the Java digest match.
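
For reference, here is a minimal sketch of the Java side using UTF-16LE (the class and method names are my own, not from the question). With the BOM gone and the byte order matching, the digest should agree with the iOS and C# methods above:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5Utf16Le {

    // Hash the string as UTF-16LE: no BOM, little-endian, so the input bytes
    // match NSUTF16LittleEndianStringEncoding (iOS) and UnicodeEncoding (C#).
    public static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_16LE));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every JVM is required to provide MD5, so this should never happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the hex digest of "123"; the iOS and C# methods above
        // should produce the same string for the same input.
        System.out.println(md5Hex("123"));
    }
}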
