PowerShell Unicode Characters Transforming Unexpectedly

Question

I've got a program that uses a few hash tables to resolve information. I'm getting some weird issues with foreign characters. Below is an accurate representation:

$Props =
@{
    P1  = 'Norte Americano e Inglês'
}

$Expressions =
@{
    E1  = { $Props['P1'] }
}

& $Expressions['E1']

If I paste this into PowerShell 5.1 console or run selection in VSCode I get:

Norte Americano e Inglês

As expected. But if I run the code in VSCose (hit F5). I get:

Norte Americano e InglÃªs

By debugging, setting a breakpoint right after the hash literal, I can tell the incorrect version is actually in the hash. So this isn't somehow a side effect of the call operator or the use of script blocks.

I attempted to set the output encoding like:

$OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = New-Object System.Text.UTF8Encoding

But this doesn't seem to change the pattern. Frankly, I'm surprised the console is handling Unicode so well in the first place. However, I can't understand the inconsistency. Ultimately this data is written to an AD attribute which again works fine if I execute the steps manually, but gets mangled if I actually run the script, even when the output encoding is set as previously mentioned.

I did look through this Q&A, but I don't seem to be having a console display issue, although that may be a result of the true type fonts. Perhaps they're masking the problem.

Interestingly it does seem to work correctly in VSCode if I switch it to PowerShell 7.1. However, because of integration with the AD cmdlets, which do not function well through implicit session compatibility, it's not possible to use PowerShell Core for this project.

The Dev environment is Windows 2012R2 up-to-date. I'm not sure there's an ability to change the system code page as is mentioned for Win 10 (1909).

Since the problem occurs with a string literal in your source code, the likeliest explanation is that your script file is misinterpreted by PowerShell, which happens if the script is saved as UTF-8 without a BOM. Try saving your script as UTF-8 with BOM; see this answer for more information. — mklement0
– mklement0, Commented Jun 4, 2021 at 18:24

Santiago Squarzon · Accepted Answer · 2021-03-19 02:36:15Z

1

This is pretty ugly but what happens if you try this at the end of your code:

$enc = [System.Text.Encoding]::UTF8
$enc.GetString($enc.GetBytes($(& $Expressions['E1'])))

Also, this might help you Encode a string in UTF-8

answered Mar 19, 2021 at 2:36

Santiago Squarzon

65.7k5 gold badges26 silver badges60 bronze badges

Sign up to request clarification or add additional context in comments.

4 Comments

Steven Over a year ago

That's definitely helping. Let me get this into the main code and close this issue out in the morning. This seems to be working too [System.Text.Encoding]::UTF8.GetString( [Char[]](& $Expressions['E1']) ) but I have to see it through to the output. THANKS!

Steven Over a year ago

This worked well under the circumstances. Though I would still like an explanation of the observed behavior. At any rate, I ran with your sample adjusted as mentioned and packed it in a function for convenience. Thanks again!

Santiago Squarzon Over a year ago

Glad it worked Steven. My guess is that by default PS std out seems to be UTF8-noBOM and we're forcing UTF8-BOM here. Though I thought by default PS was always using UTF8-BOM, this is strange behavior for me too :P

mklement0 Over a year ago

This answer is probably the right solution to the problem: if you save the script file as UTF-8 with BOM, Windows PowerShell no longer misinterprets it (PowerShell Core defaults to UTF-8, and therefore reads it correctly even without a BOM). Your solution attempt tries to fix the already-misinterpreted string after the fact, but this isn't a complete solution, because certain Unicode characters can break reading of the script altogether, such as an €.

Collectives™ on Stack Overflow

PowerShell Unicode Characters Transforming Unexpectedly

1 Answer 1

4 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

4 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related