
It was tough coming up with an appropriate title for my question. First, a little background information in case you need it.

*I have a bill that I am trying to read information off of using regexes. I save the information I need into 4 different tables: Account, Utility, Location, and Taxes.

The logic is that each bill has only one account number (account level). Each account number can pertain to multiple utilities (utility level). Each utility can have multiple locations (assume only one location for this question), and each location can have more than one tax.*

So for the bill found HERE, we can see that 4 taxes (City Sales Tax of 2.97, County Sales Tax of 1.46, State Sales Tax of 3.44, and PPRTA Tax of 1.10) all belong to the 'Electric' utility. We also see that 4 utilities (Electric, Gas, Water, and Wastewater) belong to 1 account number, each with its own taxes.

Previously I have been doing something simple like this to capture all of the taxes in one capture group, multiple times: Tax:. \$(.)

What I am trying to accomplish now is to build a regex that finds all of the taxes for a given utility only. Again, it must be one capture group with multiple matches.

Here is an example of what I have so far for the Electric taxes: (?:Electric Commercial Service(?:.\n)?.?Tax:.* \$(.)(?:.\n)?.?Total charge this service)*

As you can see, this only picks up the first tax. I cannot figure out a way to make it catch every tax between the words "Electric Commercial Service" and "Total charge this service" for the Electric service.

Thanks!

2 Answers


You can't do it in a single regex in most languages. A capture group yields only one element in the match array, even when the group is repeated by a quantifier; only the last repetition is kept.
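
To make the limitation concrete, here is a minimal sketch in Python (used only as a stand-in language; the sample string is made up). A quantified group matches every repetition but reports only the last one, while the same group without the outer quantifier, applied repeatedly, returns each amount.

```python
import re

# Made-up sample text, not taken from the actual bill.
text = "Tax: $2.97 Tax: $1.46 Tax: $3.44"

# One match with a repeated group: the group is overwritten on each repetition.
repeated = re.search(r"(?:Tax: \$([\d.]+)\s*)+", text)
last_only = repeated.groups()

# The same group without the outer quantifier, applied across the whole string.
all_amounts = re.findall(r"Tax: \$([\d.]+)", text)

print(last_only)    # ('3.44',)
print(all_amounts)  # ['2.97', '1.46', '3.44']
```

.NET is one of the exceptions behind "most languages": its Group.Captures collection keeps every repetition of a group, so a C# program may have another option worth checking.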

You need to do it in two steps. First use a regexp (or other means) to extract the portion of the bill for a single utility. Then within that string, you can use the regex

Tax:.* \$([\d.]+)$

to find all the taxes. In PHP, you'd use preg_match_all to find all the matches of this; other languages should have something comparable (maybe involving the g modifier to the regex).
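
A rough sketch of the two-step approach in Python, for illustration only. The sample bill layout, the non-tax amounts, the Gas figures, and the helper name taxes_for_utility are all assumptions, since the real bill is not reproduced here; the section markers come from the question and the tax regex is the one above.

```python
import re

# Hypothetical bill excerpt. Only the four Electric tax amounts come from the
# question; every other line and number is invented for this sketch.
SAMPLE_BILL = """\
Electric Commercial Service
Energy charge: $120.00
City Sales Tax: $2.97
County Sales Tax: $1.46
State Sales Tax: $3.44
PPRTA Tax: $1.10
Total charge this service $128.97
Gas Commercial Service
Gas charge: $50.00
City Sales Tax: $1.25
Total charge this service $51.25
"""

# Step 2 pattern: the tax regex from the answer, with $ matching at line ends.
TAX_AMOUNT = re.compile(r"Tax:.* \$([\d.]+)$", re.MULTILINE)


def taxes_for_utility(bill_text, utility):
    """Return the tax amounts (as strings) found in one utility's section."""
    # Step 1: isolate the text between the utility header and its total line.
    section = re.search(
        re.escape(utility) + r" Commercial Service(.*?)Total charge this service",
        bill_text,
        re.DOTALL,
    )
    if section is None:
        return []
    # Step 2: collect every tax amount inside that section only.
    return TAX_AMOUNT.findall(section.group(1))


electric_taxes = taxes_for_utility(SAMPLE_BILL, "Electric")
gas_taxes = taxes_for_utility(SAMPLE_BILL, "Gas")
print(electric_taxes)  # ['2.97', '1.46', '3.44', '1.10']
print(gas_taxes)       # ['1.25']
```

The non-greedy (.*?) in step 1 stops at the first "Total charge this service" after the header, which is what keeps one utility's taxes from leaking into another's.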


2 Comments

That's what I was afraid of. I am using this in a recently developed C# program that will need to be majorly overhauled if I have to break the text down into substrings. I appreciate the quick and thorough response.
Ended up using regexes to break the text into substrings by utility and running my normal regexes on each substring. Thanks!

It can be done as a one-liner. It was fun to do, but it got ugly:

Gas Commercial Service \([\S\s]+?(?:[\s]+(?:(?:(?:[\w]+ )*)?(?:[\w]+)?Tax:[xX\d\.\%\s]*?\$[\d\.\s]*?\$([\d\.]*)\s*?))(?:[\s]+(?:(?:(?:[\w]+ )*)?(?:[\w]+)?Tax:[xX\d\.\%\s]*?\$[\d\.\s]*?\$([\d\.]*)\s*?))?(?:[\s]+(?:(?:(?:[\w]+ )*)?(?:[\w]+)?Tax:[xX\d\.\%\s]*?\$[\d\.\s]*?\$([\d\.]*)\s*?))?(?:[\s]+(?:(?:(?:[\w]+ )*)?(?:[\w]+)?Tax:[xX\d\.\%\s]*?\$[\d\.\s]*?\$([\d\.]*)\s*?))?(?:[\s]+(?:(?:(?:[\w]+ )*)?(?:[\w]+)?Tax:[xX\d\.\%\s]*?\$[\d\.\s]*?\$([\d\.]*)\s*?))?(?:[\s]+(?:(?:(?:[\w]+ )*)?(?:[\w]+)?Tax:[xX\d\.\%\s]*?\$[\d\.\s]*?\$([\d\.]*)\s*?))?

Explained demo here: http://regex101.com/r/fI7hU9

For Electric, just change the first word.

Updated to accept SurTax and the like.

3 Comments

While this is an impressive regex, I have a few concerns. First, a provider can add an obscure tax at any time. Ergo, I need to capture anything that even contains the word "Tax". I played with yours, and when I add something like "Federal Gas SurTax 1% X $130.34 $1.30", it remains uncaptured. Second, and less important, I need to capture only the amount, not the entire string, so the capture group for the above-mentioned federal gas surtax would be just "1.30".
Updated post and link, it's a mess but does the job. Glad you went the proper way though.
Your regex skills are impressive! Had I not already rewritten my code to accept substrings, I would use the regex. Thanks for the help!
