I am trying to identify errors in a log file. The application uses five uppercase letters followed by three digits followed by 'E' as an error code. The error code is followed by a non-word character. I was identifying cases with:

$errors = Select-String -Path "logfile.txt" -Pattern "[A-Z]{5}[0-9]{3}E\W"

However, the rest of the log content now also includes strings such as

ab1bea8a-a00e-4211-b1db-2facecfd725e.

Which is being matched by the regex and flagged as an error. I changed the regex to

\p{Lu}{5}[0-9]{3}E\W

(which I expected to match only five uppercase characters). Why does it still match the lowercase, non-error string?

    You may make \p{Lu} insensitive to the outer regex settings by placing it into a modifier group like (?-i:\p{Lu}{5}[0-9]{3}E\W) Commented Jul 9, 2018 at 9:37

2 Answers

The "case-insensitive" regex flag is set by Select-String, which makes \p{Lu} case-insensitive, just as it does with [A-Z].

Try adding the -CaseSensitive parameter to the command.

You can confirm this by running some .NET code, for example in LINQPad:

(new Regex(@"\p{Lu}", RegexOptions.IgnoreCase)).IsMatch("a")
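As a rough sketch of how the -CaseSensitive suggestion could be applied to the command from the question: the two sample lines below are hypothetical stand-ins for logfile.txt, and the first statement is a PowerShell version of the LINQPad check, so no C# tooling is needed.

```powershell
# PowerShell version of the LINQPad check above.
$ignoreCase = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
[regex]::IsMatch('a', '\p{Lu}', $ignoreCase)

# Hypothetical sample lines standing in for logfile.txt.
$lines = @(
    'Request ab1bea8a-a00e-4211-b1db-2facecfd725e. completed',
    'Failure ABCDE123E: disk full'
)

# -CaseSensitive switches off the ignore-case behaviour, so only the
# line containing a genuine uppercase error code is returned.
$errors = $lines | Select-String -Pattern '\p{Lu}{5}[0-9]{3}E\W' -CaseSensitive
$errors | ForEach-Object { $_.Line }
```

With a real log file, the same switch can be added to the original call that uses -Path "logfile.txt".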

2 Comments

You may also include this link in the answer. The detail about Select-String's case-insensitive matching behavior is not easy to find in the docs, although the existence of the -CaseSensitive option should hint at it. Also, if one wants to mix case-sensitivity options, it would be a good idea to add an example with modifier groups (not the case here, but useful for generic cases).
Agreed, I thought about including that, but it wasn't a direct requirement of the question. And then I saw your comment, so that was taken care of anyway. :)

PowerShell regular expression matching is case-insensitive by default. There are several ways for making matches case-sensitive, though.

  • Add the -CaseSensitive switch when using the Select-String cmdlet:

    -CaseSensitive

    Makes matches case-sensitive. By default, matches are not case-sensitive.

    C:\> 'abc' | Select-String -Pattern 'A'
    
    abc
    
    C:\> 'ABC' | Select-String -Pattern 'A'
    
    ABC
    
    C:\> 'abc' | Select-String -Pattern 'A' -CaseSensitive    # ← no match here
    C:\> 'ABC' | Select-String -Pattern 'A' -CaseSensitive
    
    ABC
    
  • Use the case-sensitive version of the regular expression matching operators:

    By default, all comparison operators are case-insensitive. To make a comparison operator case-sensitive, precede the operator name with a c. For example, the case-sensitive version of -eq is -ceq. To make the case-insensitivity explicit, precede the operator with an i. For example, the explicitly case-insensitive version of -eq is -ieq.

    C:\> 'abc' -match 'A'
    True
    C:\> 'ABC' -match 'A'
    True
    C:\> 'abc' -cmatch 'A'    # ← no match here
    False
    C:\> 'ABC' -cmatch 'A'
    True
    
  • Force a case-sensitive match by prefixing the regular expression with the inline option construct (?-i), which switches the "case-insensitive" regex option off. This is a miscellaneous construct (?...), not to be confused with a non-capturing group (?:...). It works with both the Select-String cmdlet and the -match operator:

    C:\> 'abc' | Select-String -Pattern '(?-i)A'    # ← no match here
    C:\> 'ABC' | Select-String -Pattern '(?-i)A'
    
    ABC
    
    C:\> 'abc' -match '(?-i)A'    # ← no match here
    False
    C:\> 'ABC' -match '(?-i)A'
    True
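The operator and inline-option approaches above can also be tried against the error-code pattern from the question. This is a minimal sketch: $guid and $code are made-up sample strings, and the last two lines use a scoped group (?-i:...), as suggested in the comments, so that only part of the pattern is case-sensitive.

```powershell
$pattern = '[A-Z]{5}[0-9]{3}E\W'
$guid = 'ab1bea8a-a00e-4211-b1db-2facecfd725e.'   # sample non-error text
$code = 'ABCDE123E:'                              # sample error code

# Case-sensitive operator.
$guidCmatch = $guid -cmatch $pattern              # False
$codeCmatch = $code -cmatch $pattern              # True

# Inline option: case-insensitivity is switched off for the whole pattern.
$guidInline = $guid -match "(?-i)$pattern"        # False
$codeInline = $code -match "(?-i)$pattern"        # True

# Scoped group: 'ERROR' is still matched case-insensitively,
# while the error code itself must be uppercase.
$mixed = 'ERROR (?-i:[A-Z]{5}[0-9]{3}E)\W'
$upperCode = 'error ABCDE123E:' -match $mixed     # True
$lowerCode = 'error abcde123e:' -match $mixed     # False
```

The scoped form is mainly useful when the rest of a longer pattern should keep PowerShell's default case-insensitive behaviour.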
    
