I found that the following code compiles and works:

func foo(p:UnsafePointer<UInt8>) {
    var p = p
    for p; p.memory != 0; p++ {
        print(String(format:"%2X", p.memory))
    }
}

let str:String = "今日"
foo(str)

This prints E4BB8AE697A5, which is a valid UTF-8 representation of 今日.
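
For readers on a current toolchain, here is a minimal sketch of the same experiment in Swift 5 syntax, where `memory` became `pointee` and C-style `for` loops are no longer available. The function name `hexBytes` is hypothetical, and the sketch assumes the implicit String-to-pointer bridging hands the callee a NUL-terminated UTF-8 buffer that is only valid for the duration of the call.

```swift
// Hypothetical helper: collects the bytes of a NUL-terminated buffer as hex.
func hexBytes(_ p: UnsafePointer<UInt8>) -> String {
    var p = p
    var out = ""
    // Walk the buffer one byte at a time until the terminating NUL.
    while p.pointee != 0 {
        let hex = String(p.pointee, radix: 16, uppercase: true)
        // Left-pad to two hex digits per byte.
        out += hex.count < 2 ? "0" + hex : hex
        p += 1
    }
    return out
}

let text = "今日"
// The String is implicitly bridged to a temporary UTF-8 buffer for this call.
let result = hexBytes(text)
print(result) // E4BB8AE697A5
```

The pointer must not be stored or returned from the function, since the temporary buffer goes away once the call finishes.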

As far as I know, this is undocumented behavior. From the documentation:

When a function is declared as taking a UnsafePointer argument, it can accept any of the following:

  • nil, which is passed as a null pointer
  • An UnsafePointer, UnsafeMutablePointer, or AutoreleasingUnsafeMutablePointer value, which is converted to UnsafePointer if necessary
  • An in-out expression whose operand is an lvalue of type Type, which is passed as the address of the lvalue
  • A [Type] value, which is passed as a pointer to the start of the array, and lifetime-extended for the duration of the call

In this case, str is none of them.

Am I missing something?


ADDED:

And it doesn't work if the parameter type is UnsafePointer<UInt16>:

func foo(p:UnsafePointer<UInt16>) {
    var p = p
    for p; p.memory != 0; p++ {
        print(String(format:"%4X", p.memory))
    }
}
let str:String = "今日"
foo(str)
//  ^ 'String' is not convertible to 'UnsafePointer<UInt16>'

Even though the internal String representation is UTF-16:

let str = "今日"
var p = UnsafePointer<UInt16>(str._core._baseAddress)
for p; p.memory != 0; p++ {
    print(String(format:"%4X", p.memory)) // prints 4ECA65E5 which is UTF16 今日
}
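
The snippet above reaches into the private `_core` property, which is not a supported API. As a rough sketch of a supported route, the UTF-16 code units can be copied into an array through the public `utf16` view, and the array can then be passed where an UnsafePointer<UInt16> is expected (the `[Type]` case from the quoted list). The names `hexUnits`, `units`, and `utf16Hex` are hypothetical, and the length is passed explicitly because the array is not NUL-terminated.

```swift
// Hypothetical helper: formats `count` UTF-16 code units as 4-digit hex.
func hexUnits(_ p: UnsafePointer<UInt16>, count: Int) -> String {
    var out = ""
    for i in 0..<count {
        let hex = String(p[i], radix: 16, uppercase: true)
        // A UInt16 needs at most four hex digits, so the padding is never negative.
        out += String(repeating: "0", count: 4 - hex.count) + hex
    }
    return out
}

// Copy the code units out of the public UTF-16 view.
let units = Array("今日".utf16)
// The array is passed as a pointer to its first element for the duration of the call.
let utf16Hex = hexUnits(units, count: units.count)
print(utf16Hex) // 4ECA65E5
```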
  • It seems to me it is the last one, no? Commented Nov 21, 2014 at 14:30
  • I think, no. String is not Array<UInt8> Commented Nov 21, 2014 at 14:32
  • I meant to say the penultimate one. It is just like an in-out variable. Maybe the wording "which is passed" is not clear. It could mean "this is how the function will interpret this argument" (which I think is meant) or "this is what you have to pass in" (which I think is not meant here). Commented Nov 21, 2014 at 14:35
  • It is now documented: "A String value, if Type is Int8 or UInt8. The string will automatically be converted to UTF8 in a buffer, and a pointer to that buffer is passed to the function" developer.apple.com/library/content/documentation/Swift/… Commented May 4, 2017 at 10:20

1 Answer

This works because of one of the interoperability changes the Swift team has made since the initial launch. You're right that it looks like it hasn't made it into the documentation yet. A String is accepted where an UnsafePointer<UInt8> is required so that you can call C functions that expect a const char * parameter without a lot of extra work.

Look at the C function strlen, defined in "shims.h":

size_t strlen(const char *s);

In Swift it comes through as this:

func strlen(s: UnsafePointer<Int8>) -> UInt

Which can be called with a String with no additional work:

let str = "Hi."
strlen(str)
// 3
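
As a small sketch of the same call on a current toolchain (where, as far as I know, strlen is imported with an Int return type rather than UInt), the implicit bridging can be compared with the explicit `withCString` form, which makes the lifetime of the temporary C string visible. The variable names here are hypothetical.

```swift
import Foundation

let greeting = "Hi."

// Implicit bridging: the String is passed as a temporary NUL-terminated C string.
let implicitLength = strlen(greeting)

// Explicit form: the pointer is only valid inside the closure.
let explicitLength = greeting.withCString { cString in
    strlen(cString)
}

print(implicitLength, explicitLength) // 3 3
```

Both forms count UTF-8 bytes rather than characters, so a non-ASCII string reports a length larger than its character count.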

Look at the revisions on this answer to see how C-string interop has changed over time: https://stackoverflow.com/a/24438698/59541

3 Comments

Thanks! Nice. According to the swiftc -emit-sil output, it actually creates a temporary Array<UInt> from String.UTF8View.Generator. That doesn't look very fast...
Huh. Well, the "I" in SIL stands for intermediate, right? Depending on how strings are actually implemented in the compiled runtime (what if they're just char* under the hood?), that might be a no-op.
@rintaro: As of Swift 5 the underlying character storage is (null-terminated) UTF-8. One reason for the change was to make passing Swift strings to C functions efficient.
