1

I've got the following text:

instance=hostname1, topic="AB_CD_EF_12345_ZY_XW_001_000001"
instance=hostname2, topic="AB_CD_EF_1345_ZY_XW_001_00001"
instance=hostname1, topic="AB_CD_EF_1235_ZY_XW_001_000001"
instance=hostname2, topic="AB_CD_EF_GH_4567_ZY_XW_01_000001"
instance=hostname1, topic="AB_CD_EF_35678_ZY_XW_001_00001"
instance=hostname2, topic="AB_CD_EF_56789_ZY_XW_001_000001"

I would like to capture numbers from the sample above. I've tried to do so with the regular expressions below and they work well as separate queries:

Regex: .*topic="AB_CD_EF_([^_]+).*
Matches: 12345 1345 1235

Regex: .*topic="AB_CD_EF_GH_([^_]+).*
Matches: 4567 35678 56789

But I need a single regex that gives me all of the numbers, i.e.:

12345 1345 1235 4567 35678 56789
3
  • From your description, it seems that your input is split across multiple lines? I have edited the question accordingly; if that is wrong, please say so and we can revert it. Commented Jun 25, 2019 at 0:47
  • You have not given a clear requirement or described the problem you are facing. Could it simply be "find the number between _EF_ or _GH_ and _ZY"? That could be done with /.*(?:_EF_|_GH_)(\d+)_ZY.*/ Commented Jun 25, 2019 at 0:49
  • How about (?<=EF_)(\d+)(?=_ZY), Demo: regex101.com/r/vbLN9L/6 Commented Jun 25, 2019 at 0:50

4 Answers

2

Make GH_ optional:

.*topic="AB_CD_EF_(GH_)?([^_]+).*

This matches every line in your sample; the target number is captured in group 2.

See live demo.
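As a rough check of the optional-GH_ idea, here is a minimal Python sketch run against the sample lines from the question. The names `LINES`, `WITH_GROUP`, `FIRST_GROUP`, and `extract` are made up for illustration. The second pattern is an assumption rather than part of the answer: it makes GH_ non-capturing so the number lands in group 1, which may help with tools that only read the first group.

```python
import re

# Sample lines copied from the question.
LINES = [
    'instance=hostname1, topic="AB_CD_EF_12345_ZY_XW_001_000001"',
    'instance=hostname2, topic="AB_CD_EF_1345_ZY_XW_001_00001"',
    'instance=hostname1, topic="AB_CD_EF_1235_ZY_XW_001_000001"',
    'instance=hostname2, topic="AB_CD_EF_GH_4567_ZY_XW_01_000001"',
    'instance=hostname1, topic="AB_CD_EF_35678_ZY_XW_001_00001"',
    'instance=hostname2, topic="AB_CD_EF_56789_ZY_XW_001_000001"',
]

# Pattern from the answer: GH_ is optional, the number is in group 2.
WITH_GROUP = re.compile(r'.*topic="AB_CD_EF_(GH_)?([^_]+).*')

# Assumed variant: GH_ is non-capturing, so the number is in group 1.
FIRST_GROUP = re.compile(r'.*topic="AB_CD_EF_(?:GH_)?([^_]+).*')


def extract(pattern, group):
    """Return the requested group for every sample line."""
    return [pattern.match(line).group(group) for line in LINES]


numbers_group2 = extract(WITH_GROUP, 2)
numbers_group1 = extract(FIRST_GROUP, 1)
print(numbers_group2)
print(numbers_group1)
```

Both lists should contain the same six numbers; only the group index differs.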


You could be more general by allowing any number of "letter letter underscore" sequences using:

.*topic="(?:[A-Z]{2}_)+([^_]+).*

See live demo.
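A small Python sketch of the more general pattern, assuming each line is matched from its start. `GENERAL` and `topic_number` are hypothetical names, and the `AB_789_ZY` input is an invented example of a shorter prefix, not data from the question.

```python
import re

# Any run of "two upper-case letters + underscore", then the captured token.
GENERAL = re.compile(r'.*topic="(?:[A-Z]{2}_)+([^_]+).*')


def topic_number(line):
    """Return the first token after the letter-pair prefix, or None."""
    m = GENERAL.match(line)
    return m.group(1) if m else None


print(topic_number('instance=hostname1, topic="AB_CD_EF_12345_ZY_XW_001_000001"'))
print(topic_number('instance=hostname2, topic="AB_CD_EF_GH_4567_ZY_XW_01_000001"'))
print(topic_number('instance=hostname3, topic="AB_789_ZY"'))  # invented input
```

Note that `[^_]+` accepts any characters other than an underscore, not only digits; if the token must be numeric, replacing it with `[0-9]+` would be a stricter choice.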


1

Another option would be an expression similar to:

topic=".*?[A-Z]_([0-9]+)_.*?"

The desired digits are captured by the group ([0-9]+).

Please see the demo for additional explanation.
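To illustrate how this lazy pattern behaves, here is a hedged Python sketch that scans the whole sample text with `re.findall`. `TEXT` and `LAZY` are illustrative names; the sample lines are those from the question.

```python
import re

# Sample text from the question, one record per line.
TEXT = "\n".join([
    'instance=hostname1, topic="AB_CD_EF_12345_ZY_XW_001_000001"',
    'instance=hostname2, topic="AB_CD_EF_1345_ZY_XW_001_00001"',
    'instance=hostname1, topic="AB_CD_EF_1235_ZY_XW_001_000001"',
    'instance=hostname2, topic="AB_CD_EF_GH_4567_ZY_XW_01_000001"',
    'instance=hostname1, topic="AB_CD_EF_35678_ZY_XW_001_00001"',
    'instance=hostname2, topic="AB_CD_EF_56789_ZY_XW_001_000001"',
])

# Lazily skip ahead to the first "letter, underscore, digits, underscore".
LAZY = re.compile(r'topic=".*?[A-Z]_([0-9]+)_.*?"')

# With a single capturing group, findall returns just the captured strings.
numbers = LAZY.findall(TEXT)
print(numbers)
```

Because `.` does not match a newline by default, each match stays within one line even though the whole text is searched at once.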

1 Comment

Hi Bohemian, thank you for sharing the link; it helped a lot. I am using the regex in Grafana, where regex handling is different. Hi Emma, thank you for the pointer; it worked. To everyone else: all of your solutions worked in the online regex tester, but I ran into issues in Grafana because it accepts only the first group. With all your help, I was able to find the right regex. Cheers
0

From the examples and conditions you've given I think you're going to need a very restrictive regex, but this may depend on how you want to adapt it. Take a look at the following regex and read the breakdown for more information on what it does. Use the first group (there is only one in this regex) as a substitution to retrieve the numbers you are looking for.

Regex

^instance\=hostname[0-9]+\,\s*topic\=\"[A-Z_]+([0-9]+)_[A-Z_]+[0-9_]+\"$

Try it out in this DEMO.

Breakdown

^                # Asserts position at start of the line
hostname[0-9]+   # Matches any and all hostname numbers
\s*              # Matches whitespace characters (between 0 and unlimited times)
[A-Z_]+          # Matches any upper-case letter or underscore (between 1 and unlimited times)
([0-9]+)         # This captures the number you want
$                # Asserts position at end of the line

Although this does answer the question you have asked, I fear it might not be exactly what you're looking for, but without further information this is the best I can give you. In any case, once you've studied the breakdown and played around with the demo a bit, it should prove to be of some help.
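For readers who want to try the restrictive approach outside the demo, here is a minimal Python sketch. It assumes plain ASCII double quotes in the input, drops the escapes that Python does not need, and uses `re.MULTILINE` so that `^` and `$` anchor at each line. `STRICT`, `TEXT`, and the lower-case `hostname3` line are illustrative additions, not part of the question.

```python
import re

# Whole-line pattern: anchored at both ends, one capturing group.
STRICT = re.compile(
    r'^instance=hostname[0-9]+,\s*topic="[A-Z_]+([0-9]+)_[A-Z_]+[0-9_]+"$',
    re.MULTILINE,
)

TEXT = "\n".join([
    'instance=hostname1, topic="AB_CD_EF_12345_ZY_XW_001_000001"',
    'instance=hostname2, topic="AB_CD_EF_1345_ZY_XW_001_00001"',
    'instance=hostname1, topic="AB_CD_EF_1235_ZY_XW_001_000001"',
    'instance=hostname2, topic="AB_CD_EF_GH_4567_ZY_XW_01_000001"',
    'instance=hostname1, topic="AB_CD_EF_35678_ZY_XW_001_00001"',
    'instance=hostname2, topic="AB_CD_EF_56789_ZY_XW_001_000001"',
    # Invented malformed line: lower-case topic, so the pattern rejects it.
    'instance=hostname3, topic="ab_cd_12_zy_001"',
])

numbers = STRICT.findall(TEXT)
print(numbers)
```

The trade-off of such a strict pattern is that any line deviating from the expected shape is skipped silently rather than partially matched.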

0

The following regex worked for me:

/.*topic="(?:[AB_CD_EF_(GH_)]{2,3}_)+([^_]+).*/

1 Comment

You should accept your own answer so as to mark the question as being answered. Read more about why this is encouraged in this blog post named It’s OK to Ask and Answer Your Own Questions.
