
First, I'm trying to understand this. Second, I would like to use it.

# Test string
$pgNumString = 'C:\test\test5\AALTONEN-ALLAN_PENCARROW_PAGE_1.txt'

# Regex with capture group for number '1' ONLY from $pgNumString
# In other use cases it may be page 10 or any page in 100s
$pgNumRegex = "(?s)_(\d+)\."

# Simplest - not using -SimpleMatch because this example uses regex (Select-String docs)
$pgNum = $pgNumString | Select-String -Pattern $pgNumRegex -AllMatches 

The match is not assigned to $pgNum. No capture grouping means no good anyway. A slightly more sophisticated attempt:

$pgNum = $pgNumString | Select-String -Pattern $pgNumRegex -AllMatches | Select-Object {$_.Matches.Groups[1].Value} 

Output:

$_.Matches.Groups[1].Value
--------------------------
1

The match is still not assigned to $pgNum. But the output shows I'm on the right track. What am I doing wrong?

3 Comments

  • Change Select-Object to ForEach-Object. Commented Jan 5, 2023 at 19:32
  • Yes Santiago. Thank you. ForEach-Object gives me the correct output. Commented Jan 6, 2023 at 3:52
  • @Dave, something I didn't spell out in my answer: you only need -AllMatches if there can be multiple matches per input string. That is, if you feed text line by line to Select-String (as happens when you pass a file path to -Path / -LiteralPath, for instance, or provide input via Get-Content without using -Raw) and you only ever expect at most one match per line, you do not need -AllMatches. In cases where you do need -AllMatches, $_.Matches.Groups[1].Value is not sufficient to extract all matches, as explained in my answer. Commented Jan 6, 2023 at 4:21

1 Answer


If you're dealing with strings already in memory, and often also with files (unless they're exceptionally large), Select-String isn't necessary; it both slows down and complicates the solution, as your example shows.

The -match operator can also be used to extract only the part of interest, but it is limited to one match, whose results are reflected in the automatic $Matches variable.
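To illustrate that limitation, here is a minimal sketch using the regex from the question. The variable names $path and $twoPaths are made up for the illustration and are not part of the original code.

```powershell
# Single path, as in the question.
$path = 'C:\test\test5\AALTONEN-ALLAN_PENCARROW_PAGE_1.txt'

# With a scalar left-hand side, -match returns $true / $false
# and fills the automatic $Matches variable on success.
if ($path -match '_(\d+)\.') {
  $pgNum = $Matches[1]   # value of capture group 1
}
$pgNum   # -> 1

# With two candidates in one string, -match still reports only the first one.
$twoPaths = "PAGE_1.txt`nPAGE_42.txt"
$null = $twoPaths -match '_(\d+)\.'
$firstOnly = $Matches[1]
$firstOnly   # -> 1 (the 42 is not reported)
```

For a single path with a single page number this is often all that is needed.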

However, you can make direct use of an underlying .NET API, namely [regex]::Matches().

# Sample input.
$pgNumString = @'
C:\test\test5\AALTONEN-ALLAN_PENCARROW_PAGE_1.txt
C:\test\test6\AALTONEN-ALLAN_PENCARROW_PAGE_42.txt
'@

# -> '1', '42'
# Note: To match PowerShell's case-*insensitive* behavior (not relevant here), use:
#  [regex]::Matches($pgNumString, '(?<=_)\d+(?=\.)', 'IgnoreCase').Value
[regex]::Matches($pgNumString, '(?<=_)\d+(?=\.)').Value

As an aside:

  • Bringing the functionality of [regex]::Matches() natively to PowerShell in the future, in the form of a -matchall operator, is the subject of GitHub issue #7867.

Note that I've modified your regex to use look-around assertions so that what it matches consists solely of the substring to extract, which is then reflected in the .Value property.
For an explanation of the regex and the ability to experiment with it, see this regex101.com page.

Using your original approach requires extra work to extract the capture-group values, with the help of the intrinsic .ForEach() method:

[regex]::Matches($pgNumString, '_(\d+)\.').ForEach({ $_.Groups[1].Value })
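If only one page number per string is expected, as in the question, a single-match variant may be enough. The following is a sketch under that assumption; it uses [regex]::Match() (singular) and casts the result to [int], and the page number 10 and the names $path and $m are chosen purely for illustration.

```powershell
# Hypothetical single path with a two-digit page number.
$path = 'C:\test\test5\AALTONEN-ALLAN_PENCARROW_PAGE_10.txt'

# [regex]::Match() returns the first match only (or an unsuccessful match object).
$m = [regex]::Match($path, '_(\d+)\.')

if ($m.Success) {
  # Capture group 1 holds the digits; cast to [int] to calculate with them.
  $pgNum = [int] $m.Groups[1].Value
}

$pgNum       # -> 10
$pgNum + 1   # -> 11
```

Checking .Success first avoids casting an empty string when a path contains no page number.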

As for what you tried:

As Santiago notes, you need to use ForEach-Object instead of Select-Object, but there's an additional requirement:

Given your use of -AllMatches, you need to access .Groups[1].Value on each of the matches reported in .Matches, otherwise you'll only get the first match's capture-group value:

$pgNumString | 
  Select-String -Pattern $pgNumRegex -AllMatches |
  ForEach-Object { $_.Matches.ForEach({ $_.Groups[1].Value }) }
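The difference only shows up when a single input string contains more than one match. A small sketch, assuming a made-up one-line input ($line) with two page numbers:

```powershell
# Hypothetical input: one string containing two matches.
$line = 'PAGE_1.txt PAGE_42.txt'

$matchInfo = $line | Select-String -Pattern '_(\d+)\.' -AllMatches

# Indexing into the flattened .Groups collection yields
# the first match's capture group only.
$first = $matchInfo.Matches.Groups[1].Value
$first   # -> 1

# Visiting each match separately yields every capture-group value.
$all = $matchInfo.Matches.ForEach({ $_.Groups[1].Value })
$all     # -> 1, 42
```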

As an aside:

  • Making Select-String return only the matching parts of the input lines / strings, via an -OnlyMatching switch, is a green-lit future enhancement; see GitHub issue #7712.

  • While this wouldn't directly help with capture groups, it is usually possible to reformulate regexes with look-around assertions, as shown with [regex]::Matches() above.
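For completeness, the look-around regex can also be combined with Select-String today. The sketch below assumes the paths arrive as separate strings (the array $lines is made up for the illustration), so there is at most one match per input and -AllMatches is not needed.

```powershell
# Hypothetical input: one path per string.
$lines = 'C:\test\test5\AALTONEN-ALLAN_PENCARROW_PAGE_1.txt',
         'C:\test\test6\AALTONEN-ALLAN_PENCARROW_PAGE_42.txt'

# The look-around assertions exclude '_' and '.' from the match itself,
# so .Value is already the page number; no capture group is involved.
$pgNums = ($lines | Select-String -Pattern '(?<=_)\d+(?=\.)').Matches.Value

$pgNums   # -> 1, 42
```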


1 Comment

Thank you mklement0. A lot to unpack here.
