14

Editor's note: This code example is from a version of Rust prior to 1.0 and is not valid Rust 1.0 code, but the answers still contain valuable information.

I want to pass a string literal to a Windows API. Many Windows functions use UTF-16 as the string encoding while Rust's native strings are UTF-8.

I know Rust has utf16_units() to produce an iterator of UTF-16 code units, but I don't know how to use it to build a UTF-16 string terminated by a zero.

I'm producing the UTF-16 string like this, but I am sure there is a better method to produce it:

extern "system" {
    pub fn MessageBoxW(hWnd: int, lpText: *const u16, lpCaption: *const u16, uType: uint) -> int;
}

pub fn main() {
    let s1 = [
        'H' as u16, 'e' as u16, 'l' as u16, 'l' as u16, 'o' as u16, 0 as u16,
    ];
    unsafe {
        MessageBoxW(0, s1.as_ptr(), 0 as *const u16, 0);
    }
}

3 Answers

27

Rust 1.8+

str::encode_utf16 is stable and returns an iterator over the string's UTF-16 code units.

Call collect() on that iterator to build a Vec<u16>, then push(0) to append the terminator:

pub fn main() {
    let s = "Hello";

    let mut v: Vec<u16> = s.encode_utf16().collect();
    v.push(0);
}
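
The same steps can be folded into a small helper. This is only a sketch: the name `to_wide` is hypothetical (it is not part of the standard library or of the original answer), and appending the terminator with `chain(once(0))` is simply an alternative to calling `push(0)` afterwards.

```rust
use std::iter::once;

/// Hypothetical helper: encode `s` as UTF-16 and append a terminating 0.
fn to_wide(s: &str) -> Vec<u16> {
    s.encode_utf16().chain(once(0)).collect()
}

fn main() {
    let wide = to_wide("Hello");
    // `wide.as_ptr()` yields a `*const u16` that stays valid while `wide` is alive.
    println!("{:?}", wide);
}
```

Keep the vector alive for as long as the API uses the pointer. If the input contains an interior '\0', a function that expects a zero-terminated string will stop reading at that point.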

Rust 1.0+

Before Rust 1.8, str::utf16_units() / str::encode_utf16 is unstable. The alternative is to either switch to nightly (a viable option if you're writing a program, not a library) or to use an external crate like encoding:

extern crate encoding;

use std::slice;

use encoding::all::UTF_16LE;
use encoding::{Encoding, EncoderTrap};

fn main() {
    let s = "Hello";

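    // Encode to UTF-16LE bytes, then append a two-byte zero terminator.
    let mut v: Vec<u8> = UTF_16LE.encode(s, EncoderTrap::Strict).unwrap();
    v.push(0);
    v.push(0);
    // Reinterpret the bytes as u16 values. This relies on the Vec<u8> buffer
    // being 2-byte aligned, which the type itself does not guarantee.
    let s: &[u16] = unsafe { slice::from_raw_parts(v.as_ptr() as *const _, v.len() / 2) };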
    println!("{:?}", s);
}

(or you can use from_raw_parts_mut if you want a &mut [u16]).

However, be careful with endianness in this example. UTF_16LE gives you a vector of bytes representing u16 values in little-endian byte order, while the from_raw_parts trick "views" those bytes as a slice of u16 values in your platform's native byte order, which may be big-endian. A crate like byteorder may help if you want complete portability.
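
One way to sidestep both the byte-order question and the pointer cast is to decode each byte pair explicitly. The following is a minimal sketch using only the standard library, which needs a newer compiler than the Rust 1.0 this answer targets. The helper name `le_bytes_to_u16` is hypothetical, and the input bytes are written by hand here rather than produced by the encoding crate.

```rust
/// Hypothetical helper: interpret `bytes` as little-endian UTF-16 code units.
/// A trailing odd byte, if any, is ignored.
fn le_bytes_to_u16(bytes: &[u8]) -> Vec<u16> {
    bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect()
}

fn main() {
    // "Hi" in UTF-16LE followed by a two-byte zero terminator.
    let bytes = [0x48, 0x00, 0x69, 0x00, 0x00, 0x00];
    let units = le_bytes_to_u16(&bytes);
    println!("{:?}", units);
}
```

This costs one extra copy, but the resulting values are the same on little-endian and big-endian targets, and no unsafe code is involved.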

This discussion on Reddit may also be helpful.


2 Comments

Wow, that works! Thanks. Previously I used let mut v = s.utf16_units().collect::<u16>(); but the code failed to compile.
@GigihAjiIbrahim, it failed to compile because collect()'s type argument should be the target collection, not the element type. collect::<Vec<u16>>() would have worked too.

6

Rust 1.46+

For static UTF-16 strings, the utf16_lit crate provides an easy-to-use macro that does this at compile time:

use utf16_lit::utf16_null;

fn main() {
    let s = &utf16_null!("Hello");
    println!("{:?}", s);
}


6

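If you are using literals, you can use the w! macro from windows-sys, which produces a null-terminated UTF-16 string pointer at compile time: https://docs.rs/windows-sys/latest/windows_sys/macro.w.html

use windows_sys::w;

// Assumes MessageBoxW is in scope and that this call sits inside an `unsafe` block.
MessageBoxW(0, w!("Hello"), std::ptr::null(), 0);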

2 Comments

I think in the current Rust version, this is the best answer if we are dealing with the Windows API. Never thought that my question from 9 years ago would still be useful 🤣
I'm new to Rust, but am using the windows crate and had no idea this existed! FWIW, it's also available from windows::core::w
