1

What I have now (below) searches for XX-##-# and forces it to XX-##-#0000.

How can I do this to return XX-##-0000#?

Is there a way to force 5 digits at the end, padding with leading 0s, to cover the other possibilities (XX-##-##, XX-##-###, XX-##-####)? I'd rather not copy this 4 times with a slight adjustment for each case.

$Pattern1 = '[a-zA-Z][a-zA-Z]-[0-9][0-9]-[0-9]'
Get-ChildItem 'C:\path\to\file\*.txt' -Recurse | ForEach {
     (Get-Content $_ | 
     ForEach  { $_ -replace $Pattern1, ('$1'+'0000')}) | 
     Set-Content $_
}

Thanks.

EDIT: I would like to do the following

Search           Replacement
XX-##-#          XX-##-0000#
XX-##-##         XX-##-000##
XX-##-###        XX-##-00###
XX-##-####       XX-##-0####
6
  • $Pattern1 or $date_Pattern1? Your $Pattern1 contains no capturing groups, and you refer to some with $1 - please show all your code. Commented Jun 7, 2017 at 15:55
  • @WiktorStribiżew my bad, it's supposed to be $Pattern1, I fixed it.. the $1 is a back-reference Commented Jun 7, 2017 at 15:58
  • Yeah, a backreference to a capturing group value. Your regex has no capturing groups. Commented Jun 7, 2017 at 15:59
  • @WiktorStribiżew oh.. i have to jump into something else at work but I think I understand what i'm missing now.. i'll give it another go later.. thanks Commented Jun 7, 2017 at 16:01
  • 1
    I might be too busy, but here is an approach you may pursue. Commented Jun 7, 2017 at 16:25

6 Answers

4

Unfortunately, in Windows PowerShell the -replace operator doesn't support passing an expression (script block) as the replacement operand, which a succinct solution would require here. (PowerShell 7 does accept a script block there.)

However, you can use the appropriate [regex] .NET type's .Replace() method overload:

Note: This solution focuses just on the regex-based replacement part, but it's easy to embed it into the larger pipeline from the question.

# Define sample array.
$lines = @'
Line 1 AB-00-0 and also AB-01-1
Line 2 CD-02-22 after
Line 3 EF-03-333 it
Line 4 GH-04-4444 goes 
Line 5 IJ-05-55555 on
'@ -split '\r?\n'

# Loop over lines...
$lines | ForEach-Object {
  # ... and use a regex with 2 capture groups to capture the substrings of interest
  #     and use a script block to piece them together with number padding
  #     applied to the 2nd group
  ([regex] '\b([a-zA-Z]{2}-[0-9]{2}-)([0-9]+)').Replace($_, { 
    param($match)
    $match.Groups[1].Value + '{0:D5}' -f [int] $match.Groups[2].Value
  })
}

The above yields:

Line 1 AB-00-00000 and also AB-01-00001
Line 2 CD-02-00022 after
Line 3 EF-03-00333 it
Line 4 GH-04-04444 goes 
Line 5 IJ-05-55555 on
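
For readers on PowerShell 7, the same replacement can be sketched with the -replace operator itself, which accepts a script block there (this does not work in Windows PowerShell 5.1). Inside the script block, $_ is the current match object. The two sample strings and the $padded variable are only illustrative:

```powershell
# Assumes PowerShell 7; Windows PowerShell 5.1 does not accept a script block here.
$lines = 'Line 1 AB-00-0 and also AB-01-1', 'Line 3 EF-03-333 it'

# -replace is applied to each array element; $_ inside the block is the [Match].
$padded = $lines -replace '\b([a-zA-Z]{2}-[0-9]{2}-)([0-9]+)', {
    # Keep the prefix as-is and zero-pad the trailing number to 5 digits.
    $_.Groups[1].Value + ('{0:D5}' -f [int] $_.Groups[2].Value)
}

$padded
```

Every match on a line is handled, so there is no one-match-per-line restriction.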

Comments

3

This is ancillary information that should help you fix your current code and come to the correct conclusion.

https://technet.microsoft.com/en-us/library/ee692795.aspx

I recommend utilizing a combination of the techniques listed in this documentation. The example provided is very helpful in numeric formatting:

$a = 348 
"{0:N2}" -f $a
"{0:D8}" -f $a
"{0:C2}" -f $a
"{0:P0}" -f $a
"{0:X0}" -f $a

Output
348.00
00000348
$348.00
34,800 %
15C

You can also use [String]::Format and add some checks to ensure the item is formatted properly. For example, if no value is specified, you could simply default it to 0.

https://blogs.technet.microsoft.com/heyscriptingguy/2013/03/11/understanding-powershell-and-basic-string-formatting/
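
A minimal sketch of that idea, assuming a hypothetical helper named Format-Suffix (the name is not from the linked articles): it parses the digits, falls back to 0 when nothing usable is supplied, and pads to 5 digits with [String]::Format.

```powershell
# Format-Suffix is a hypothetical helper name used only for illustration.
function Format-Suffix {
    param([string] $Digits)

    $n = 0
    # TryParse leaves $n at 0 when $Digits is empty or not a valid integer.
    [void] [int]::TryParse($Digits, [ref] $n)

    # D5 pads the integer with leading zeros to at least 5 digits.
    [String]::Format('{0:D5}', $n)
}

Format-Suffix '42'
Format-Suffix ''
```

The first call yields 00042 and the second falls back to 00000.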

Hope this helped.

Comments

1

Create capturing groups. Combine them while applying the formatting to the second.

Edit: Updated to remove the assumption that the line is only the matching string. Note, the assumption that there is only one match per line still exists.

$Pattern1 = '^(.*?)([a-zA-Z][a-zA-Z]-\d\d-)(\d+)(.*)$'
Get-ChildItem 'C:\path\to\file\*.txt' -Recurse |
    ForEach-Object {
        (Get-Content $_ | 
            ForEach-Object {
                if ($_ -match $Pattern1) {
                     "{0}{1}{2:D5}{3}" -f $matches[1],$matches[2],[int]$matches[3],$matches[4]
                } else {
                    $_
                }
            }) | Set-Content -Path $_
    }

Comments

0
#example of the text we'd load in from file / wherever
$value = @'
this is an example of a value to be replaced: 1AB-23-45 though
we may also want to replace 0CD-87-6 or even 9ZX-00-12345 that
'@

#regex to detect the #XX-##-##### pattern you mentioned (\b word boundaries included so we don't pick up these patterns if they're somehow part of a larger string; though that seems unlikely in this case) 
$pattern = '\b(\d[A-Z][A-Z])-(\d\d)-(\d{1,5})\b'
#what we want our output to look like; with placeholders 0, 1, & 2 taking values from our captures from the above.
$format = '{0}-{1}-{2:D5}'

<#
 # #use select-string to allow us to capture every match, rather than just the first
 # $value | Select-String -Pattern $pattern -AllMatches | %{
 #     #loop through every match replacing the matched string with the reformatted version of itself
 #     $_.matches | %{
 #         #NB: we have to convert the match in group#3 to int to ensure the {2:D5} formatting from above will be applied as expected
 #         $value = $value -replace "\b$($_.value)\b", ($format -f $_.Groups[1].Value,$_.Groups[2].Value,([int]$_.Groups[3].Value))
 #     }
 # }
#>

#or this version's a little more efficient; using the matched positions to replace the strings in those positions with our new formatted value
$value | Select-String -Pattern $pattern -AllMatches | %{
    $_.matches | sort index -Descending | %{
        $value = $value.remove($_.index, $_.value.length).insert($_.index, ($format -f $_.Groups[1].Value,$_.Groups[2].Value,([int]$_.Groups[3].Value)))
    }
}


$value

Comments

0

Use a look-ahead:

$Pattern1 = '(?<=[a-zA-Z][a-zA-Z]-\d\d-)(?=\d)(?!\d\d)'
Get-ChildItem 'C:\path\to\file\*.txt' -Recurse | ForEach {
(Get-Content $_ | 
ForEach  { $_ -replace $Pattern1, ('0000')}) | 
Set-Content $_
}

The look-ahead (?=\d) asserts, without consuming, that the next char is a digit.

The negative look-ahead (?!\d\d) asserts that the next two characters are not both digits, so you don't end up with XX-##-0000##.

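Also note that \d (a "digit") matches everything [0-9] does and is easier to type; in .NET it additionally matches other Unicode decimal digits unless RegexOptions.ECMAScript is set.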

I think you have to do the replacement 4 times.

3 Comments

thanks.. I was unclear about what I was trying to do.. please see the edit
\d is not exactly the same as [0-9] in .NET regex. Besides, I doubt '$1'+'0000' will work at all.
@WiktorStribiżew FYI it didn't work.. the input is still appreciated @ Bohemian
0

Simple example. Search numbers at the end of a string.

$text = 'aa-11-123'
$null = $text -match '\d+$'  # sets $matches; $null = suppresses the True output
$result = ($matches.0).padleft(5,'0')
$text -replace '\d+$', $result  # \d* won't work right

aa-11-00123
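
The same PadLeft idea can be extended to codes that appear anywhere in a string, not just at the end. The following is a sketch under the assumption that the codes look like XX-##-digits; the sample text and variable names are illustrative. A look-behind anchors on the prefix so that only the trailing digits are matched and padded:

```powershell
# Sample input with two codes, neither anchored to the end of the string.
$text = 'aa-11-123 and bb-22-7'

# The look-behind requires the XX-##- prefix without including it in the match,
# so the script block (used as a MatchEvaluator) receives only the digits.
$result = [regex]::Replace($text, '(?<=\b[a-zA-Z]{2}-\d{2}-)\d+', {
    param($m)
    $m.Value.PadLeft(5, '0')
})

$result
```

Because PadLeft works on the string itself, no integer conversion is involved, and runs of 5 or more digits are left unchanged.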

Comments
