I have noticed that if I print a byte array holding the UTF-8 representation of a string with the format specifier "%s", printf() renders it correctly but NSLog() garbles it (i.e., each byte is printed as its own character, so for example "¥" comes out as the two characters "¬•").
This is curious, because I always thought that NSLog() is just printf(), plus:
- The first parameter (the 'format') is an Objective-C string, not a C string (hence the "@").
- The timestamp and app name prepended.
- The newline automatically added at the end.
- The ability to print Objective-C objects (using the format "%@").
My code:
NSString* string;
// (...fill string with unicode string...)
const char* stringBytes = [string cStringUsingEncoding:NSUTF8Encoding];
NSUInteger stringByteLength = [string lengthOfBytesUsingEncoding:NSUTF8Encoding];
stringByteLength += 1; // add room for '\0' terminator
char* buffer = calloc(stringByteLength, sizeof(char)); // calloc(count, size); memory is zero-filled
memcpy(buffer, stringBytes, stringByteLength);
NSLog(@"Buffer after copy: %s", buffer);
// (prints one character per byte, so any non-ASCII text comes out garbled)
printf("Buffer after copy: %s\n", buffer);
// (renders correctly, e.g. japanese text)
Somehow, it looks as if printf() is "smarter" than NSLog(). Does anyone know the underlying cause, and whether this behavior is documented anywhere? (I couldn't find anything.)
I know the usual way is to NSLog a UTF-8 string as an NSString, not as a C string. But in this particular case, I wanted to check that the char buffer had been copied correctly.
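Since the goal is only to verify the copied bytes, one way to sidestep the encoding question is to dump the buffer as hex, which is pure ASCII and should look the same through printf() and NSLog(). A minimal sketch in plain C follows; hex_dump is a hypothetical helper introduced here for illustration, not something from Foundation or the C library:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (an assumption, not part of the original code):
   renders the first len bytes of buf as space-separated two-digit hex
   values into out. Returns the number of bytes rendered, which is less
   than len if out is too small to hold them all. */
size_t hex_dump(const char *buf, size_t len, char *out, size_t out_size)
{
    size_t used = 0; /* characters written to out so far */
    size_t i;

    if (out_size == 0) {
        return 0;
    }
    out[0] = '\0';

    for (i = 0; i < len; i++) {
        /* Worst case per byte: separator + two hex digits + '\0'. */
        if (out_size - used < 4) {
            break;
        }
        /* Cast through unsigned char so bytes >= 0x80 are not sign-extended. */
        snprintf(out + used, out_size - used, "%s%02x",
                 i ? " " : "", (unsigned int)(unsigned char)buf[i]);
        used += i ? 3 : 2;
    }
    return i;
}
```

With the code above, this could be called as hex_dump(buffer, stringByteLength, out, sizeof out) on a local char array named out, and out then printed with "%s" through either function. For a string containing only "¥", the expected dump is "c2 a5 00": the two UTF-8 bytes followed by the terminator.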