78

How can I encode the Unicode character U+0048 (H), say, in a PowerShell string?

In C# I would just do this: "\u0048", but that doesn't appear to work in PowerShell.

2
  • What's your output encoding set to? ($OutputEncoding) Commented Jun 29, 2009 at 5:34
  • 1
    It's us-ascii. But U+0048 should be encodable in that. I'm actually trying to encode an escape character (U+001B). Commented Jun 29, 2009 at 6:46

7 Answers

100

Replace '\u' with '0x' and cast it to System.Char:

PS > [char]0x0048
H

You can also use the "$()" syntax to embed a Unicode character into a string:

PS > "Acme$([char]0x2122) Company"
AcmeT Company

Here T is how this console rendered U+2122 (™, the sign for non-registered trademarks). The string itself contains the correct character; the console simply could not display it.

Note: this method works only for characters in Plane 0, the BMP (Basic Multilingual Plane), chars < U+10000.
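Since the comments under the question mention the escape character (U+001B), here is a minimal sketch of the same cast applied to it. The variable names $esc and $red are illustrative only, and whether the colour actually renders depends on the host supporting ANSI/VT sequences, which is an assumption here.

```powershell
# Build the ESC character (U+001B) with the same [char] cast used above.
$esc = [char]0x1B

# ${esc} delimits the variable name so the following '[' stays literal text.
$red = "${esc}[31mError${esc}[0m"

# Inspect the first code unit instead of relying on console rendering.
'{0:X4}' -f [int]$red[0]    # 001B
```

Checking the code unit value, rather than looking at the console, avoids confusing an encoding problem with a display problem.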


5 Comments

You can even write a little function: function C($n) {[char][int]"0x$n"}. Which you can use in a string as follows: "$(C 48)ello World." Not ideal but probably a little closer to the \u escape.
This also works when you want to pass a unicode [char] to a function. Thanks for the help.
I know this topic is 2.5 years old, but following up on @Joey's comment, you can even make a function called \u. It's identical to Joey's, just with a different name. So the function is function \u($n) {[char][int]"0x$n"}. The way you call it is just like C# except that you need a space between the function name and the number. So \u 0048 returns H.
This only works for characters in the BMP; otherwise it triggers an error. E.g. [char]0x1D400: InvalidArgument: Cannot convert value "119808" to type "System.Char". Error: "Value was either too large or too small for a character."
@noraj The reason this only works for characters in the BMP is that .NET’s char type represents UTF-16 code units, and for BMP characters, 1 character = 1 code unit, but for non-BMP characters, 1 character = 2 code units. /// @chris the \u function could be extended to work with non-BMP characters.
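One possible way to extend such a helper to non-BMP characters is sketched below. The function name ConvertFrom-CodePoint and its parameter $Hex are hypothetical, not something from the comments above; the sketch relies on [char]::ConvertFromUtf32, which returns a string rather than a single [char].

```powershell
# Hypothetical helper: turn a hex code point string (e.g. '1D400') into a string.
function ConvertFrom-CodePoint {
    param([Parameter(Mandatory)][string]$Hex)

    # Parse the hex digits as a 32-bit integer code point.
    $codePoint = [Convert]::ToInt32($Hex, 16)

    # ConvertFromUtf32 returns one UTF-16 code unit for BMP characters
    # and a surrogate pair (two code units) for everything above U+FFFF.
    [char]::ConvertFromUtf32($codePoint)
}

"$(ConvertFrom-CodePoint '48')ello"      # Hello
(ConvertFrom-CodePoint '1D400').Length   # 2 (a surrogate pair)
```

The arguments are quoted here so that the hex digits always reach the function as text and are never first interpreted as a PowerShell number literal.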
35

According to the documentation, PowerShell Core 6.0 adds support with this escape sequence:

PS> "`u{0048}"
H

see https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_special_characters?view=powershell-6#unicode-character-ux


17

Maybe this isn't the PowerShell way, but this is what I do. I find it to be cleaner.

[regex]::Unescape("\u0048") # Prints H
[regex]::Unescape("\u0048ello") # Prints Hello
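Two details are worth a short sketch here, assuming the standard .NET behaviour of Regex.Unescape: single quotes keep PowerShell from expanding $ or backticks before the regex class sees the text, and a non-BMP character can be written as its two UTF-16 surrogate escapes. The variable names are illustrative.

```powershell
# Single-quoted so PowerShell passes the backslash escapes through untouched.
$hello = [regex]::Unescape('\u0048ello')

# U+1F30E (globe) written as its surrogate pair \uD83C \uDF0E.
$globe = [regex]::Unescape('\uD83C\uDF0E')

$hello           # Hello
$globe.Length    # 2 (two UTF-16 code units forming one character)
```

Keep in mind that Unescape also interprets other backslash sequences such as \n and \t, so it is a poor fit for strings that contain Windows paths.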


6

For those of us still on Windows PowerShell 5.1 who want to use code points above U+FFFF (which the [char] cast shown above cannot handle), I made this function so you can simply build strings like so:

'this is my favourite park ',0x1F3DE,'. It is pretty sweet ',0x1F60A | Unicode


# Takes a pipeline of strings and integers, treats each integer as a
# Unicode code point, and concatenates everything into one UTF-16 string.
Function Unicode {
    Begin {
        $output = [System.Text.StringBuilder]::new()
    }
    Process {
        $output.Append($(
            # Integers become one or two UTF-16 code units; anything else is stringified.
            if ($_ -is [int]) { [char]::ConvertFromUtf32($_) }
            else { [string]$_ }
        )) | Out-Null
    }
    End { $output.ToString() }
}

Note that getting these to display in your console is a whole other problem, but if you're outputting to an Outlook email or a GridView it will just work (as UTF-16 is native for .NET interfaces).

This also means you can output plain control (not necessarily Unicode) characters pretty easily if you're more comfortable with decimal, since you don't actually need the 0x (hex) syntax to make the integers. 'hello',160,'there' | Unicode would put a non-breaking space between the two words, the same as if you had used 0xA0 instead.

4 Comments

[char]::ConvertFromUtf32 has been available since .NET Framework 2.0, so you don't need such a complex function
oh neat. The function is still necessary, I'm not writing [char]blahblahblah whenever I want a "`u{}", but it does simplify the if
Besides, $_ -shr 10 should be used instead of [int][math]::Floor($_ / 0x400) (dividing by 0x400 is a right shift by 10 bits), and ($_ -band 0x3FF) -bor 0xDC00 instead of [char]($_ % 0x400 + 0xDC00)
I s'pose that's obvious since it was a nice even hex number, oh well. Doesn't matter now that .NET can handle the overarching problem
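For readers curious about the arithmetic these comments refer to, here is a minimal sketch of computing a UTF-16 surrogate pair by hand. It is not the code from the earlier revision of the answer; the variable names are illustrative, and in practice [char]::ConvertFromUtf32 does this for you.

```powershell
# Code point to encode; U+1F3DE is outside the BMP.
$codePoint = 0x1F3DE

# Subtract 0x10000, leaving a 20-bit value to split into two 10-bit halves.
$offset = $codePoint - 0x10000

# High (lead) surrogate: top 10 bits combined with 0xD800.
$high = [char](($offset -shr 10) -bor 0xD800)
# Low (trail) surrogate: bottom 10 bits combined with 0xDC00.
$low = [char](($offset -band 0x3FF) -bor 0xDC00)

# Join the two code units into a string.
$manual = "$high$low"

'{0:X4} {1:X4}' -f ([int]$high), ([int]$low)    # D83C DFDE
```

The resulting $manual holds the same two code units that [char]::ConvertFromUtf32(0x1F3DE) returns.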
4

To make it work for characters outside the BMP you need to use Char.ConvertFromUtf32()

'this is my favourite park ' + [char]::ConvertFromUtf32(0x1F3DE) + 
'. It is pretty sweet ' + [char]::ConvertFromUtf32(0x1F60A)

In PowerShell 6.0 or newer you can also use `u{}

"this is my favourite park `u{1F3DE}. It is pretty sweet `u{1F60A}"

This special character was added in PowerShell 6.0.

The Unicode escape sequence (`u{x}) allows you to specify any Unicode character by the hexadecimal representation of its code point. This includes Unicode characters above the Basic Multilingual Plane (> 0xFFFF) which includes emoji characters such as the thumbs up (`u{1F44D}) character. The Unicode escape sequence requires at least one hexadecimal digit and supports up to six hexadecimal digits. The maximum hexadecimal value for the sequence is 10FFFF.

Unicode character (`u{x})

1 Comment

Yep, with emojis in PowerShell you would need 2 surrogate characters in a row: stackoverflow.com/a/70057239/6654942.
2

Another way using PowerShell.

$Heart = $([char]0x2665)
$Diamond = $([char]0x2666)
$Club = $([char]0x2663)
$Spade = $([char]0x2660)
Write-Host $Heart -BackgroundColor Yellow -ForegroundColor Magenta

Use the command help Write-Host -Full to read all about it.

1 Comment

Shay Levy's answer above already showed how to use [char]0x2665. In fact this is less efficient, because each variable is wrapped in an unnecessary $() subexpression instead of being assigned directly: $Heart = [char]0x2665
0

Note that some characters like 🌎 lie outside the BMP and need a "double rune" (a UTF-16 surrogate pair, i.e. two [char] values) to be printed:

PS> "C:\foo\bar\$([char]0xd83c)$([char]0xdf0e)something.txt"

Will print:

C:\foo\bar\🌎something.txt

You can find these "runes" here, in the "unicode escape" row: https://dencode.com/string
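The two values can also be looked up from PowerShell itself. The following is a sketch that assumes you already know the code point (U+1F30E here); the variable names are illustrative.

```powershell
# Encode U+1F30E as a UTF-16 string (two code units for a non-BMP character).
$globe = [char]::ConvertFromUtf32(0x1F30E)

# List each code unit as a 0x-prefixed hex value that can be fed back to [char].
$units = $globe.ToCharArray() | ForEach-Object { '0x{0:X4}' -f [int]$_ }

$units -join ' '    # 0xD83C 0xDF0E
```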

1 Comment

No need for such a complex manual lookup method. My answer already shows several solutions: "C:\foo\bar\`u{1F30E}something.txt" or "C:\foo\bar\" + [char]::ConvertFromUtf32(0x1F30E) + "something.txt" will work.
