
I'm trying to match a pattern in a file with a regex. I have tried the patterns below, but I'm not good with regex.

  • ((\|\n.*|\n))\d.*\n\s.*[0-9]{1,3}\s
  • ((\|\n.*|\n))\d\d\d\d\d\d\d\n\s\s\s\s\s\s\s\s\s\s[0-9]{1,3}\s
  • ((\|\n.*|\n))\d{7,8}\n\s.*[0-9]{1,3}\s
  • \|\n\s.*\d{7}\n\s.*[0-9]{1,3}\s
  • ^.*\|\r?\n.*\r?\n[0-9]{1,3}$

I have a file that has lines like these:

  $00.00|0.00|0.00|||
  8360657
  68694

What I'm trying to do is figure out whether the 3rd line is between 1 and 3 digits long. If it's longer than 3 digits, I don't care about it.

There is a lot more data in this file, and for each occurrence of the above 3 lines I want to find every match where the 3rd line in my example is 3 digits or fewer. How can I modify my regex to work?

Here is my example code of what I've tried:

$file = "C:\Users\user\Desktop\del2\file.le"
$content = gc $file -raw
$guarantorRegex = "((\|\n.*|\n))\d{7,8}\n\s.*[0-9]{1,3}\s"
$content -match $guarantorRegex

I have got these to match on regex101.com; however, I can't get them to work in PowerShell.


What worked for me in the end:

$file = "C:\Users\user\Desktop\del2\D2341202.le"
$content = gc $file -raw
$guarantorRegex = "\|\r?\n[ ]{10}.*\r?\n[ ]{10}[0-9]{1,3}\s"
$content | select-string -Pattern $guarantorRegex -AllMatches | % { $_.Matches } | % { $_.Value } > "C:\Users\user\Desktop\matches.txt"
  • 1
    Is your code going to always match those three lines exactly as they are? With whitespace as well? What is going to be consistent between the three line occurrences, and what may differ? Commented Dec 18, 2019 at 0:00
  • 1
    Try ^.*\|\r?\n.*\r?\n[0-9]{1,3}$ regex101.com/r/lttbzU/1 Commented Dec 18, 2019 at 0:02
  • 1
    At regex101.com, all line endings are \n, your file must have Windows line endings, CRLF. Commented Dec 18, 2019 at 0:04
  • 1
    Then perhaps like ^[ ]{10}.*\|\r?\n[ ]{10}.*\r?\n[ ]{10}[0-9]{1,3}$ regex101.com/r/1w8BJP/1 Commented Dec 18, 2019 at 0:05
  • 2
    I think it should work (demo). Perhaps with the multiline inline modifier: (?m)^[ ]{10}.*\|\r?\n[ ]{10}.*\r?\n[ ]{10}[0-9]{1,3}$ Commented Dec 18, 2019 at 0:17
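
To check the line-ending point raised in the comments, one option is to count CRLF pairs against bare LF characters in the raw content. This is only a sketch: the sample string stands in for the real file (an assumption), and the variable names are made up for illustration.

```powershell
# Hypothetical sample standing in for (Get-Content $file -Raw); `r`n is an explicit CRLF.
$content = "          `$00.00|0.00|0.00|||`r`n          8360657`r`n          694`r`n"

# Count CRLF pairs, then LF characters that are not preceded by a CR.
$crlfCount = [regex]::Matches($content, '\r\n').Count
$lfCount   = [regex]::Matches($content, '(?<!\r)\n').Count

"CRLF: $crlfCount, bare LF: $lfCount"   # CRLF: 3, bare LF: 0
```

If the CRLF count is non-zero, a pattern that uses a plain \n between lines will not match across line breaks, which is why the suggestions here use \r?\n.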

1 Answer

To match exactly 10 spaces, apply a quantifier to a space: [ ]{10}

(The square brackets are for clarity only.)

(?m)^[ ]{10}.*\|\r?\n[ ]{10}.*\r?\n[ ]{10}[0-9]{1,3}$
  • (?m) Inline modifier that enables multiline mode, so ^ and $ match at line boundaries
  • ^ Start of line
  • [ ]{10}.*\| Match 10 spaces, then any char except a newline 0+ times, then a literal |
  • \r?\n[ ]{10}.* Match a newline (with an optional carriage return), 10 spaces, then any char except a newline 0+ times
  • \r?\n[ ]{10}[0-9]{1,3} Match a newline, 10 spaces, then 1 to 3 digits 0-9
  • $ End of line
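
One caveat with the anchored pattern: in .NET (and therefore PowerShell), $ in multiline mode matches just before \n, not before \r\n. On a file with CRLF endings the closing $ can therefore fail unless an optional \r is placed before it. The sketch below uses made-up sample data and variable names, and adds a capture group around the digits purely for illustration.

```powershell
# Hypothetical sample: two three-line groups, 10-space indents, CRLF endings.
$pad = ' ' * 10
$lines = @(
    "$pad`$00.00|0.00|0.00|||", "${pad}8360657", "${pad}68694",
    "$pad`$12.00|0.00|0.00|||", "${pad}8360658", "${pad}694"
)
$content = ($lines -join "`r`n") + "`r`n"

# Anchored pattern with a plain $ at the end.
$plain = '(?m)^[ ]{10}.*\|\r?\n[ ]{10}.*\r?\n[ ]{10}[0-9]{1,3}$'
# Same pattern, but tolerating a carriage return before the end of the line.
$crlfSafe = '(?m)^[ ]{10}.*\|\r?\n[ ]{10}.*\r?\n[ ]{10}([0-9]{1,3})\r?$'

$plainCount = [regex]::Matches($content, $plain).Count
$short = [regex]::Matches($content, $crlfSafe) | ForEach-Object { $_.Groups[1].Value }

$plainCount   # 0: the \r before \n blocks the plain $
$short        # 694: only the group whose third line has 1 to 3 digits
```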


Note that \s will also match a newline.

If you want to match whitespace chars except a newline, you could use [^\S\r\n]{10}
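
A small illustration of the difference, using a made-up two-line string: \s is free to consume the line break, while [^\S\r\n] (whitespace minus CR and LF) stays on one line.

```powershell
# Hypothetical sample: "A", a CRLF, four spaces, then "B".
$text = "A`r`n    B"

# \s+ consumes the CRLF and the spaces, so the match spans both lines.
$spansLines = [regex]::IsMatch($text, 'A\s+B')                # True

# [^\S\r\n]+ cannot match CR or LF, so it cannot cross the line break.
$staysOnLine = [regex]::IsMatch($text, 'A[^\S\r\n]+B')        # False

# On a single line it still matches ordinary spaces and tabs.
$sameLine = [regex]::IsMatch("A `tB", 'A[^\S\r\n]+B')         # True

$spansLines, $staysOnLine, $sameLine
```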


If you don't want to use anchors and there is a whitespace char at the end, you could use the pattern that worked for you

\|\r?\n[ ]{10}.*\r?\n[ ]{10}[0-9]{1,3}\s
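
If only the short number is wanted rather than the whole three-line match, a capture group can be added around the digits. This is a sketch under assumptions: the sample content and variable names are invented, and the group is an addition to the pattern above, not part of it.

```powershell
# Hypothetical sample standing in for (Get-Content $file -Raw), with CRLF endings.
$pad = ' ' * 10
$content = @(
    "$pad`$00.00|0.00|0.00|||", "${pad}8360657", "${pad}694", ''
) -join "`r`n"

# Unanchored pattern with a capture group around the 1 to 3 digits.
$pattern = '\|\r?\n[ ]{10}.*\r?\n[ ]{10}([0-9]{1,3})\s'

# The raw content is passed as one string, so the pattern can span lines.
$numbers = $content | Select-String -Pattern $pattern -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Groups[1].Value }

$numbers   # 694
```

Using $_.Value instead of $_.Groups[1].Value returns the full matched text, as in the version that worked for you.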