Experts, I am stuck. I have a string with various patterns separated by commas. I need to validate two things: (1) that each of the patterns matches zero or more of the comma-separated strings, and (2) that there are no strings that do not match a pattern. Each string element is/will be separated by a comma (nothing else), and there may be trailing spaces after the commas (which I know I can remove with substitution before the validation step!).

Steps taken so far:

Split the string (by comma) into an array of string elements.

String patterns to match:

(a) single numbers e.g. 1,2,300,5,7,80 (up to 4 digits) etc.   
(b) ranges e.g. 1-5, 23-45,45-23 (up to 4 digits either side) etc.
(c) r1-r50, r45-r4 (up to 4 digits)
(d) 1-z, z-100
(e) string which contains one of two patterns 12-34:odd and 34-4:even

What I would like is to pull the groups of pattern matching strings directly into an array through regexp comparison on the original string, rather than splitting it into an array (which does of course work!).

So what regexp(s) would I need to filter for each potential pattern and extract the matching strings elements by looking at the original string?

This is not urgent as I have a working version by splitting them into elements, but as a learning exercise, I am unclear how I can construct and apply regexp to the string with commas.

Stretch question: How can I quickly identify whether there are string elements which DO NOT follow any of the patterns?

Thanks

What I have so far is:

(a) ^\d{0,5}*$
(b) ^\d{0,5}-\d{0,5}*$
(c) (?:z)
(d) (?:r)
(e) (?:odd|even)

so code extract:

$reg1 = '^\d{0,5}*$'    
$reg2 = '^\d{0,5}-\d{0,5}*$'
$reg3 = '(?:z)'           
$reg4 = '(?:r)'           
$reg5 = '(?:odd|even)'    

This is applied to the string split into array elements.

$phase1 = $rangearray | Select-String $reg1 -AllMatches | %{$_.Line} | sort # single numbers only
if($phase1.count -gt 0) { $Allpages[0].Single = $true; $Allpages[1].Single = @($phase1); }

$phase1 = $rangearray | Select-String $reg2 -AllMatches | %{$_.Line} # number ranges
if($phase1.count -gt 0) { $Allpages[0].Ranged = $true; $Allpages[1].Ranged = {$phase1}.Invoke(); }

$phase1 = $rangearray | Select-String $reg3 -AllMatches | %{$_.Line} # max range option z
if($phase1.count -gt 0) { $Allpages[0].Z = $true; $Allpages[1].Z = @($phase1); }

$phase1 = $rangearray | Select-String $reg5 -AllMatches | %{$_.Line} # odds and evens
if($phase1.count -gt 0) { $Allpages[0].OddEven = $true; $Allpages[1].OddEven = @($phase1); }

This function is passed a string. Examples of that string are below:

$range = '1-2,12-14,27-25,300-270,r10-r15,450-470:odd'
$range = '4, 2, 5, 14-12,16-18,20-19,r3-r1,285-290,r7-r4,388-z'
  • Splitting first and then parsing each item individually sounds like a much more reasonable approach TBH Commented Nov 2, 2023 at 12:22
  • Post sample of file. There may be easier ways of accomplishing the desired results. Commented Nov 2, 2023 at 13:19
  • Samples posted. Also, perhaps my expressions could be checked. I believe they work fine on single string pattern matches. But I'm open to improvements... Commented Nov 2, 2023 at 15:16

3 Answers

I suggest using a switch statement with its -Regex switch, taking advantage of its ability to iterate over array-valued input, and its ability to produce multiple outputs that you can collect in an array:

# Sample input string.
$range = ' 4, 2, 5, 14-12,16-18,20-19,r3-r1,285-290,r7-r4,388-z, 450-470:odd, WRONG'

[string[]] $validTokens = 
  switch -Regex (($range -split ',').Trim()) {
    '^\d{1,4}(?:-\d{1,4}(:(?:odd|even))?)?$' { $_; continue } # e.g. '4', '14-12', '450-470:odd'
    '^r\d{1,4}-r\d{1,4}$' { $_; continue } # e.g. 'r3-r1'
    '^z-\d{1,4}$' { $_; continue } # e.g. 'z-100'
    '^\d{1,4}-z$' { $_; continue } # e.g. '388-z'
    default { Write-Warning "Doesn't match any expected pattern: '$_'" }
  }
  • ($range -split ',').Trim() splits the $range string by , and removes surrounding whitespace from each resulting element.

  • Each element is then tested against the regexes that form the branch conditionals; if it matches one, it is passed through ($_), and processing moves on to the next element (continue).

  • Only if none of the regexes match is the default branch reached.

To process multiple ranges, enclose the above in a foreach statement or a ForEach-Object call.
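
As a rough sketch of that last point, the switch can be wrapped in a foreach loop that returns one result object per input string. The $ranges list, the $results variable, the property names Range, Valid and Invalid, and the third input string are assumptions made for illustration; the first two strings are the samples from the question. The two z patterns are merged into one branch here, and rejected elements are collected in a list instead of being reported with Write-Warning.

```powershell
# Hypothetical input list: two samples from the question plus one invented string
# that contains an element matching no pattern.
$ranges = @(
  '1-2,12-14,27-25,300-270,r10-r15,450-470:odd',
  '4, 2, 5, 14-12,16-18,20-19,r3-r1,285-290,r7-r4,388-z',
  '7, WRONG, z-100'
)

$results = foreach ($range in $ranges) {
  # Collects the elements that fall through to the default branch.
  $invalid = [System.Collections.Generic.List[string]]::new()

  [string[]] $valid =
    switch -Regex (($range -split ',').Trim()) {
      '^\d{1,4}(?:-\d{1,4}(?::(?:odd|even))?)?$' { $_; continue } # '4', '14-12', '450-470:odd'
      '^r\d{1,4}-r\d{1,4}$'                      { $_; continue } # 'r3-r1'
      '^(?:z-\d{1,4}|\d{1,4}-z)$'                { $_; continue } # 'z-100', '388-z'
      default { $invalid.Add($_) }                                 # Add() produces no output
    }

  # One object per input string (property names are illustrative only).
  [pscustomobject]@{ Range = $range; Valid = $valid; Invalid = $invalid.ToArray() }
}
```

An input string is fully valid when its Invalid property is empty, which also covers the stretch question without a second pass over the elements.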


Here is a reference.

"... (a) single numbers e.g. 1,2,300,5,7,80 (up to 4 digits) etc. ..."

Because this pattern also matches part of every other pattern, check for it last.

\d{1,4}

"... (b) ranges e.g. 1-5, 23-45,45-23 (up to 4 digits either side) etc.
(e) string which contains one of two patterns 12-34:odd and 34-4:even ..."

\d{1,4}-\d{1,4}(?::(?:odd|even))?

"... (c) r1-r50, r45-r4 (up to 4 digits) ..."

r\d{1,4}-r\d{1,4}

"... (d) 1-z, z-100 ..."

\d{1,4}-z|z-\d{1,4}

"... So what regexp(s) would I need to filter for each potential pattern and extract the matching strings elements by looking at the original string? ..."

Merge these; here is a match pattern.

(?:r\d{1,4}-r\d{1,4}|\d{1,4}-\d{1,4}(?::(?:odd|even))?|\d{1,4}-z|z-\d{1,4})|\d{1,4}

"... I need to validate two things: (1) that each of patterns matches zero or more of the comma separated strings, and (2) that there are no strings that do not match a pattern. ..."

You'll need to validate the string first.

I presume you can use the ^ and $ syntax to assert the start and end of the string.
Here is an example.

If p is the above pattern, this equates to ^p(?:,\s*p)*$

^(?:(?:r\d{1,4}-r\d{1,4}|\d{1,4}-\d{1,4}(?::(?:odd|even))?|\d{1,4}-z|z-\d{1,4})|\d{1,4})(?:,\s*(?:(?:r\d{1,4}-r\d{1,4}|\d{1,4}-\d{1,4}(?::(?:odd|even))?|\d{1,4}-z|z-\d{1,4})|\d{1,4}))*$

From here, use the initial pattern to match each value.
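
A minimal sketch of those two steps in PowerShell, validating the whole string first and then extracting each value. The variable names $p, $whole and $tokens are assumptions for illustration, and the colon is grouped with odd|even so that "even" cannot match without it. The sketch also assumes the input has no leading or trailing whitespace; call .Trim() on it first if it might.

```powershell
# One value; \d{1,4} is listed last so the longer alternatives are tried first.
$p = 'r\d{1,4}-r\d{1,4}|\d{1,4}-\d{1,4}(?::(?:odd|even))?|\d{1,4}-z|z-\d{1,4}|\d{1,4}'

# Whole-string validator: one value, then zero or more ", value" groups.
# -f only expands {0} in the format string; the braces inside $p are left alone.
$whole = '^(?:{0})(?:,\s*(?:{0}))*$' -f $p

# Sample string from the question.
$range = '4, 2, 5, 14-12,16-18,20-19,r3-r1,285-290,r7-r4,388-z'

if ($range -match $whole) {
  # -match ignores case by default, so the extraction is told to do the same.
  $options = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
  $tokens = @([regex]::Matches($range, $p, $options) | ForEach-Object { $_.Value })
} else {
  Write-Warning "At least one element of '$range' matches no pattern."
  $tokens = @()
}
```

The validator answers only yes or no for the string as a whole; it does not say which element is at fault.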


The description of the regular expression:

((?:r\d{1,4}-r\d{1,4})|(?:\d{1,4}-\d{1,4}:(?:odd|even))|(?:z-\d{1,4})|(?:\d{1,4}-z)+|(?:\d{1,4}-\d{1,4})|(?:\d){1,4})(?:,|\s|$)

It's not required to operate on the string before applying the regex. You may apply this regex directly to get the required valid values.

Please try adding/removing/modifying the last part (?:,|\s|$) as required.

[1]: A numbered capture group. [(?:r\d{1,4}-r\d{1,4})|(?:\d{1,4}-\d{1,4}:(?:odd|even))|(?:z-\d{1,4})|(?:\d{1,4}-z)+|(?:\d{1,4}-\d{1,4})|(?:\d){1,4}]
      Select from 6 alternatives
          Match expression but don't capture it. [r\d{1,4}-r\d{1,4}]
              r\d{1,4}-r\d{1,4}
                  r
                  Any digit, between 1 and 4 repetitions
                  -r
                  Any digit, between 1 and 4 repetitions
          Match expression but don't capture it. [\d{1,4}-\d{1,4}:(?:odd|even)]
              \d{1,4}-\d{1,4}:(?:odd|even)
                  Any digit, between 1 and 4 repetitions
                  -
                  Any digit, between 1 and 4 repetitions
                  :
                  Match expression but don't capture it. [odd|even]
                      Select from 2 alternatives
                          odd
                              odd
                          even
                              even
          Match expression but don't capture it. [z-\d{1,4}]
              z-\d{1,4}
                  z-
                  Any digit, between 1 and 4 repetitions
          Match expression but don't capture it. [\d{1,4}-z], one or more repetitions
              \d{1,4}-z
                  Any digit, between 1 and 4 repetitions
                  -z
          Match expression but don't capture it. [\d{1,4}-\d{1,4}]
              \d{1,4}-\d{1,4}
                  Any digit, between 1 and 4 repetitions
                  -
                  Any digit, between 1 and 4 repetitions
          Match expression but don't capture it. [\d], between 1 and 4 repetitions
              Any digit
  Match expression but don't capture it. [,|\s|$]
      Select from 3 alternatives
          ,
          Whitespace
          End of line or string
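
One way to apply a regex like this to the unsplit string in PowerShell is sketched below. It is a variant rather than the exact expression above, and the differences are assumptions of this sketch: the repetition after \d{1,4}-z is dropped, and the trailing (?:,|\s|$) is replaced by a lookbehind/lookahead pair so that a value only counts when it fills the whole element between separators. The names $token, $pattern, $valid and $leftover are illustrative, and the input string is invented.

```powershell
# Invented input: valid elements plus one element that matches no pattern.
$range = '4, 2, 5, 14-12,r3-r1,388-z, 450-470:odd, WRONG'

# The six alternatives, in the same order as in the description above.
$token = 'r\d{1,4}-r\d{1,4}|\d{1,4}-\d{1,4}:(?:odd|even)|z-\d{1,4}|\d{1,4}-z|\d{1,4}-\d{1,4}|\d{1,4}'

# (?<![^,\s]) : preceded by a comma, whitespace, or the start of the string.
# (?![^,\s])  : followed by a comma, whitespace, or the end of the string.
$pattern = "(?<![^,\s])(?:$token)(?![^,\s])"

# Every valid element, taken straight from the original string.
$valid = @([regex]::Matches($range, $pattern) | ForEach-Object { $_.Value })

# Delete the valid elements; anything left besides separators is invalid.
$leftover = @([regex]::Replace($range, $pattern, '') -split '[,\s]+' | Where-Object { $_ })
```

If $leftover is empty, every element followed one of the patterns. Both calls are case-sensitive here; pass the IgnoreCase option to them if R or Z should also be accepted.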
