
We send sales invoices with the prefix INV. Some customers pay the invoices without mentioning the prefix. In that case I want to add the prefix so that the ERP system recognises these payments. An example of the data is below.

:61:2204210421C1339,57NMSCTOPF2510474511//GBBK031SCT TOPF2510474511
:86:RGT FACT 17133 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
:61:2204270427C4808,37NMSCTOPF2520477320//GBJ6009SCT TOPF2520477320
:86:RGT FACT 17274.17442.17546 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
:61:2203290329C5518,16NMSCTOPF2485471711//GBCJ001SCT TOPF2485471711
:86:RGT FACT.16794 16918 17079 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE

I need the output to be:

:61:2204210421C1339,57NMSCTOPF2510474511//GBBK031SCT TOPF2510474511
:86:RGT FACT INV17133 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
:61:2204270427C4808,37NMSCTOPF2520477320//GBJ6009SCT TOPF2520477320
:86:RGT FACT INV17274.INV17442.INV17546 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
:61:2203290329C5518,16NMSCTOPF2485471711//GBCJ001SCT TOPF2485471711
:86:RGT FACT.INV16794 INV16918 INV17079 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE

I wrote this regex, but it only matches the first invoice number. How do I match all of them in a group?

(:61:[0-9]{1,6}[0-9]{4}C[0-9]+\,[0-9]?[0-9]?)(NMSC.+)(\r?\n:86:RGT FACT.{1})([\d]{5})

The payment description will always be similar to this, but not always exactly the same. The order of the fields will stay the same, but I am not sure whether customers always use dots to separate the invoice numbers, for example.

  • You can repeat the 5-digit match, allowing either a dot or a space in between: (:61:[0-9]{1,6}[0-9]{4}C[0-9]+,[0-9]{0,2})(NMSC.+)(\r?\n:86:RGT FACT[ .])(\d{5}(?:[. ]\d{5})*) regex101.com/r/VKkYI3/1 Commented May 20, 2022 at 10:56
  • Thanks, but how do I then add INV to all instances instead of only the first one? Commented May 20, 2022 at 11:00
  • You could split the values of the last group by either a space or a dot and then prepend INV. Another option is to use \G, like (?:(:61:[0-9]{1,6}[0-9]{4}C[0-9]+,[0-9]{0,2})(NMSC.+)(\r?\n:86:RGT FACT)|\G(?!^))[ .](\d{5}) regex101.com/r/wYrxQ3/1 Commented May 20, 2022 at 11:01
  • Does every line start with :61:, and are these lines in real life without newlines in between? The format you used in the question does not make that clear... Commented May 20, 2022 at 12:28
  • Actually, each payment in the SWIFT MT940 standard has two lines: one starting with :61: (date, amount, debit or credit) and one starting with :86: (payment description, bank account, BIC/SWIFT code). So the relevant line is indeed :86:, which always starts with :86: and always runs until :61: (the next payment). In real life there can be newlines in between, because the bank sometimes cuts the :86: line into multiple lines for some reason, but my script puts everything on its own line so that each line starts with :61:, :86:, or one of the other MT940 codes, which I did not mention here because they are not relevant. Commented May 20, 2022 at 13:24

2 Answers


You can use the \G anchor to get contiguous matches, so that every following group of 5 digits separated by either a space or a dot is matched as well.

Note that you can omit {1} from the pattern, and you can also omit the . after FACT, because the separator is then matched by the [ .] in the repeating part that follows \G.

(?:(:61:[0-9]{1,6}[0-9]{4}C[0-9]+,[0-9]{0,2})(NMSC.+)(\r?\n:86:RGT FACT)|\G(?!^))[ .](\d{5})

Explanation

  • (?: Non-capture group for the alternation
    • (:61:[0-9]{1,6}[0-9]{4}C[0-9]+,[0-9]{0,2})(NMSC.+)(\r?\n:86:RGT FACT) Your initial pattern, without the separator and the digits
    • | OR
    • \G(?!^) Assert the position at the end of the previous match; (?!^) rules out a match at the start of the string
  • ) Close the non-capture group
  • [ .](\d{5}) Match either a space or a dot and capture 5 digits in group 4
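
To make the explanation above more concrete, here is a small sketch that lists every match the pattern produces for one payment from the question. The variable names $sample, $pattern and $rows are only chosen for this illustration. It shows that each invoice number is a separate match, and that only the first match fills groups 1 to 3.

```powershell
# One :61:/:86: pair taken from the question's sample data.
$sample = ":61:2204270427C4808,37NMSCTOPF2520477320//GBJ6009SCT TOPF2520477320`n" +
          ":86:RGT FACT 17274.17442.17546 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE"

$pattern = '(?:(:61:[0-9]{1,6}[0-9]{4}C[0-9]+,[0-9]{0,2})(NMSC.+)(\r?\n:86:RGT FACT)|\G(?!^))[ .](\d{5})'

# Every match captures one invoice number in group 4.
# Group 1 only participates in the first match; the \G matches leave it empty.
$rows = foreach ($m in [regex]::Matches($sample, $pattern)) {
    [pscustomobject]@{
        Invoice   = $m.Groups[4].Value
        HasHeader = $m.Groups[1].Success
    }
}

$rows | Format-Table
```

Because groups 1 to 3 are empty for the \G matches, a replacement such as '$1$2$3 INV$4' writes the :61: part back only once and adds just " INV" plus the number for the later matches.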

See a PowerShell demo and a regex demo.

Example

# $input is an automatic variable in PowerShell, so the sample text is stored in $text instead
$text = @"
:61:2204210421C1339,57NMSCTOPF2510474511//GBBK031SCT TOPF2510474511
:86:RGT FACT 17133 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
:61:2204270427C4808,37NMSCTOPF2520477320//GBJ6009SCT TOPF2520477320
:86:RGT FACT 17274.17442.17546 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
:61:2203290329C5518,16NMSCTOPF2485471711//GBCJ001SCT TOPF2485471711
:86:RGT FACT.16794 16918 17079 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
"@

# Groups 1-3 are written back unchanged; every 5-digit number gets " INV" in front of it
$text -replace '(?:(:61:[0-9]{1,6}[0-9]{4}C[0-9]+,[0-9]{0,2})(NMSC.+)(\r?\n:86:RGT FACT)|\G(?!^))[ .](\d{5})', '$1$2$3 INV$4'

Output

:61:2204210421C1339,57NMSCTOPF2510474511//GBBK031SCT TOPF2510474511
:86:RGT FACT INV17133 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
:61:2204270427C4808,37NMSCTOPF2520477320//GBJ6009SCT TOPF2520477320
:86:RGT FACT INV17274 INV17442 INV17546 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
:61:2203290329C5518,16NMSCTOPF2485471711//GBCJ001SCT TOPF2485471711
:86:RGT FACT INV16794 INV16918 INV17079 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
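
Note that this output replaces every separator in front of an invoice number with a space, whereas the expected output in the question keeps the original dots. If the separators have to be preserved, one possible variation (a sketch, not tested against real bank files) is to capture the separator in its own group and write it back. The extra group shifts the invoice number to group 5; the names $text, $pattern and $result are only used for this illustration.

```powershell
# Two payments from the question's sample data.
$text = @"
:61:2204270427C4808,37NMSCTOPF2520477320//GBJ6009SCT TOPF2520477320
:86:RGT FACT 17274.17442.17546 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
:61:2203290329C5518,16NMSCTOPF2485471711//GBCJ001SCT TOPF2485471711
:86:RGT FACT.16794 16918 17079 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
"@

# Same pattern as above, but the separator is captured in group 4
# and the 5 digits move to group 5.
$pattern = '(?:(:61:[0-9]{1,6}[0-9]{4}C[0-9]+,[0-9]{0,2})(NMSC.+)(\r?\n:86:RGT FACT)|\G(?!^))([ .])(\d{5})'

# ${4} writes the original separator back before the INV prefix.
$result = $text -replace $pattern, '${1}${2}${3}${4}INV${5}'
$result
```

With this variation the first :86: line should read FACT INV17274.INV17442.INV17546 and the second FACT.INV16794 INV16918 INV17079, as requested in the question.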

@Thefourthbird gave the answer in the comments and it works perfectly. I used the following script to adjust the MT940 statement and put it into production:

$pattern = '(?:(:61:[0-9]{1,6}[0-9]{4}C[0-9]+,[0-9]{0,2})(NMSC.+)(\r?\n:86:RGT FACT)|\G(?!^))[ .](\d{5})'
(Get-Content -Path $fileName -Raw) -replace $pattern, '$1$2$3 INV$4' | Set-Content -Path $fileName

It reads the MT940 text file as a single string and puts INV in front of every 5-digit number that directly follows :86:RGT FACT, provided that line comes right after a matching :61: line.
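
The comments also suggested matching the whole run of invoice numbers and then prepending INV to each value. A rough sketch of that idea is shown below, assuming (as described in the comments) that every :86: record is already on its own line. Unlike the pattern above, it does not check the preceding :61: line, and the names $text, $pattern, $evaluator and $result are only used for this illustration.

```powershell
# Sample data from the question.
$text = @"
:61:2204270427C4808,37NMSCTOPF2520477320//GBJ6009SCT TOPF2520477320
:86:RGT FACT 17274.17442.17546 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
:61:2203290329C5518,16NMSCTOPF2485471711//GBCJ001SCT TOPF2485471711
:86:RGT FACT.16794 16918 17079 TANQ BROERS SA/RGT39370 TANQ BROERS SA48 AVENUE D'ABCDE
"@

# Match the whole run of "separator + 5 digits" right after :86:RGT FACT
# at the start of a line ((?m) lets ^ match after each newline).
$pattern = '(?m)(?<=^:86:RGT FACT)(?:[ .]\d{5})+'

# For each run, prefix every 5-digit number with INV and keep the separators.
$evaluator = [System.Text.RegularExpressions.MatchEvaluator]{
    param($m)
    [regex]::Replace($m.Value, '\d{5}', 'INV$0')
}

$result = [regex]::Replace($text, $pattern, $evaluator)
$result
```

Calling [regex]::Replace with a MatchEvaluator keeps the sketch usable in Windows PowerShell as well as in newer PowerShell versions. Whether dropping the :61: check is acceptable depends on what other :86: lines the statement can contain.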
