
I used the following commands to check how the sort command orders the characters on my keyboard.

$symb="a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z","A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z","²","1","2","3","4","5","6","7","8","9","0","°","+","&","é",'"',"'","(","-","è",[regex]::escape('`'),"_","ç","à",")","=","~","#","{","[","|","\","^","@","]","}","$","¨","ˆ","£","¤","ù","*","%","µ","<",",",";",":","!",">","?",".","/","§","€"; $symb|sort|ac file.txt;(gc file.txt)-join""

Here is what I get, both in a file and on the console.

'-!"#$%&()*,./:;?@[\]ˆ^_`{|}~¨£¤€+<=>§°µ012²3456789aAàbBcCçDdEeéèfFgGhHIiJjKkLlmMNnOoPpqQRrsStTuUùvVwWXxyYzZ

For about half of the lowercase/uppercase letter pairs the order is inverted; it seems it should always be "lowercase first, uppercase next". How can that be fixed?

  • Use the CaseSensitive parameter: sort -CaseSensitive. Documentation. Commented Jan 14, 2020 at 21:34
  • @kuujinbo It didn't occur to me at all that that was needed. Thanks, it works. You should turn your comment into an answer, there is no other way to add points to your account. Commented Jan 14, 2020 at 21:42

2 Answers


PowerShell - unlike direct use of .NET types - is case-insensitive by default; you need to opt in if you want case-sensitive behavior.
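
As a minimal sketch of that default (the variable names are only illustrative), the comparison operators behave the same way: the plain operators ignore case, and their c-prefixed variants opt in to case sensitivity.

```powershell
# Plain operators compare case-insensitively.
$insensitiveEqual    = 'a' -eq 'A'              # True
$insensitiveContains = 'a', 'b' -contains 'A'   # True

# The c-prefixed variants are the case-sensitive opt-in.
$sensitiveEqual    = 'a' -ceq 'A'               # False
$sensitiveContains = 'a', 'b' -ccontains 'A'    # False

"$insensitiveEqual $insensitiveContains $sensitiveEqual $sensitiveContains"
```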

In the case of Sort-Object you need to use its -CaseSensitive switch:

PS> -join ('a', 'B', 'A', 'b' | Sort-Object -CaseSensitive)
aAbB

As you expected, this results in lowercase letters sorting first, because in the (US-English) collation order lowercase letters have lower sorting weight than uppercase ones - even though with respect to their Unicode code points the relationship is reversed (e.g., [int] [char] 'a' is 97, whereas [int] [char] 'A' is 65).

(Code-point-based sorting would apply if the array contained [char] instances, but PowerShell has no [char] literals, so a literal such as 'a' is a [string] of length 1; you can use explicit casts, however: -join ([char] 'A', [char] 'a' | Sort-Object -CaseSensitive) yields 'Aa', i.e. sorts uppercase first.)
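
The two orderings can be observed directly with .NET's string comparison methods. This is only a sketch: it assumes the invariant culture as a stand-in for the collation Sort-Object applies to strings, and the variable names are illustrative.

```powershell
# Culture-sensitive, case-sensitive comparison (ignoreCase = $false).
# A negative result means 'a' sorts before 'A'.
$cultural = [string]::Compare('a', 'A', $false, [cultureinfo]::InvariantCulture)

# Ordinal comparison, i.e. by code point.
# A positive result means 'a' (97) sorts after 'A' (65).
$ordinal = [string]::CompareOrdinal('a', 'A')

"cultural: $([math]::Sign($cultural)); ordinal: $([math]::Sign($ordinal))"
# -> cultural: -1; ordinal: 1
```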


Without -CaseSensitive, the lowercase and uppercase variants of a given letter have equal sorting weight, so no particular ordering among them is guaranteed.

For instance, the following loop exits quickly:

$prevResult = $null
while ($true) { 
  
  # Get a shuffled array of lower- and uppercase letters.
  $arr = 'a', 'B', 'A', 'b'
  $arr = $arr | Get-Random -Count $arr.Count
  
  # Sort it case-INsensitively.
  $result = -join ($arr | Sort-Object)

  $result # output

  # See if the result is different from the previous one.
  # Note the use of -cne rather than just -ne:
  # -cne is the case-*sensitive* variant of -ne
  if ($prevResult -and $prevResult -cne $result) {
    Write-Warning "Output order has changed."
    break
  }
  $prevResult = $result

} 

However, for a given input array, the two PowerShell editions differ with respect to sort stability, i.e. whether the input order of elements that sort the same is preserved:

  • In Windows PowerShell, Sort-Object is invariably not stable.

  • In PowerShell (Core) 7, Sort-Object has a -Stable switch to request stable sorting, but, as of v7.4.x, sorting appears to be stable by default (and, conversely, -Stable:$false does not opt out). That said, to be future-proof and for conceptual clarity, it is better to specify -Stable explicitly when needed.

Here's a quick example that illustrates the difference:

# PowerShell 7 (as of v7.4.x works the same even without -Stable)
# -> 'AabBCc', i.e. input order was preserved.
-join ('A', 'b', 'C', 'a', 'B', 'c' | Sort-Object -Stable)

# Windows PowerShell (no -Stable switch)
# -> !! 'aABbcC', i.e. the input order was *not* preserved.
-join ('A', 'b', 'C', 'a', 'B', 'c' | Sort-Object)
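
If input order among equal elements matters in Windows PowerShell, one possible workaround (a sketch, not something Sort-Object offers built in) is to pair each element with its input position and use that position as a secondary sort key. The Value and Index property names are made up for this illustration.

```powershell
$letters = 'A', 'b', 'C', 'a', 'B', 'c'

# Decorate each element with its original position.
$decorated = for ($i = 0; $i -lt $letters.Count; $i++) {
  [pscustomobject] @{ Value = $letters[$i]; Index = $i }
}

# Sort case-insensitively by Value; Index breaks ties,
# so elements that sort the same keep their input order.
$stableResult = -join ($decorated | Sort-Object Value, Index).Value

$stableResult # -> 'AabBCc'
```

Because the tie is broken explicitly, the result does not depend on whether the underlying sort happens to be stable.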

1 Comment

Thanks, @iRon: I think it comes down to Sort-Object not being stable in Windows PowerShell (invariably so), whereas in PowerShell 7 not only can you request stable sorting explicitly with -Stable, it appears to be performing a stable sort by default. Please see my update.

The .NET way does not have this problem.

$symb = "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "²", "1", "2", "3", "4", "5", "6", "7", "8", "9", "0", "°", "+", "&", "é", '"', "'", "(", "-", "è", [regex]::escape('`'), "_", "ç", "à", ")", "=", "~", "#", "{", "[", "|", "\", "^", "@", "]", "}", "$", "¨", "ˆ", "£", "¤", "ù", "*", "%", "µ", "<", ",", ";", ":", "!", ">", "?", ".", "/", "§", "€"; 
[Array]::Sort($symb)
$symb
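
A smaller sketch of the same call, under a few assumptions: the array is typed as [string[]], the current culture is one such as en-US where lowercase sorts before uppercase, and the second call uses [StringComparer]::Ordinal only to show the code-point order for contrast. Note that [Array]::Sort sorts the array in place and returns nothing.

```powershell
[string[]] $letters = 'a', 'B', 'A', 'b'

# Default comparer: culture-sensitive and case-sensitive.
[Array]::Sort($letters)
$defaultOrder = -join $letters   # 'aAbB' with an en-US culture (assumption)

# Ordinal comparer: sorts by code point, so uppercase comes first.
[Array]::Sort($letters, [StringComparer]::Ordinal)
$ordinalOrder = -join $letters   # 'ABab'

"$defaultOrder $ordinalOrder"
```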

2 Comments

Thanks for this information. Do you have any idea of what the problem would be?
No, but since PowerShell is case-insensitive almost everywhere, I assume its implementation of Sort is case-insensitive in some respect as well and treats upper- and lowercase letters as equivalent, leading to some inconsistency there... That's just an assumption, though. The implementation is here: github.com/PowerShell/PowerShell/blob/master/src/… (although things might have been fixed in the current code for PS7)
