0

Using regex, how do we extract multiple substrings from a string?

Suppose we have this:

resgrp/providers/Microsoft.Storage/storageAccounts/vvvvvdgdevstor","subject":"/blobServices/default/containers/coloradohhhhready/blobs/README_.._.hl7","eventType":"Microsoft.Storage.BlobCreated","eventTime":"2019-06-19T17:28:40.3136657Z","id":"604ad6c5a0145-04c4-26bsssss26a","data":{"api":"PutBlockList","clientRequestId":"aaaaaaae-4e68-95f6-c1ssssb02f"

The result I'd like is:

/coloradohhhhready/README_.._.hl7

What I've tried is:

(?i)(?<=\/containers\/)(.*)(?=\/blobs\/)(.*)(?<=\/blobs\/)(.*)(?=","eventtype)

Which yielded:

coloradohhhhready/blobs/README_.._.hl7

I simply want to remove the /blobs/ segment from that string.

5
  • If you don't match the / after /blobs in your pattern, you could use the first and third capturing groups of (?i)(?<=\/containers\/)(.*)(?=\/blobs)(.*)(?<=\/blobs)(.*)(?=","eventtype) Alternatively, you could update your pattern to (?i)(?<=\/containers)(/[^/]+)/blobs(/[^"/]+)(?=","eventtype") and use group 1 and group 2. Commented Jun 28, 2019 at 14:31
  • 3
    Just post-process it: match = match.Replace("/blobs/", "/"). Anyway, you cannot match discontinuous text within one match operation into a single group. Lookarounds are not meant to "make holes" in the text you match. Commented Jun 28, 2019 at 14:31
  • @WiktorStribiżew is this not possible to do with pure regex? Commented Jun 28, 2019 at 14:34
  • 1
    I think it is clear from my comment. Commented Jun 28, 2019 at 14:37
  • From your comment ("Anyway, you cannot match discontinuous text within one match operation into a single group. Lookarounds are not meant to 'make holes' in the text you match."), you are saying that this is simply not possible with regex. Am I understanding correctly? Thanks so much. Commented Jun 28, 2019 at 14:38
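
A minimal sketch of the post-processing idea from the comments above. It is written in Python rather than .NET, purely so it is easy to run; `extract_path` and the shortened `sample` string are hypothetical names made up for this illustration. It takes one contiguous match between /containers and ","eventtype, then drops the /blobs/ segment with a plain string replace:

```python
import re

# Shortened version of the sample text from the question (assumption: the
# rest of the JSON-like payload does not matter for the match).
sample = (
    'storageAccounts/vvvvvdgdevstor","subject":"/blobServices/default'
    '/containers/coloradohhhhready/blobs/README_.._.hl7",'
    '"eventType":"Microsoft.Storage.BlobCreated"'
)

# One contiguous match: everything after /containers up to ","eventtype.
pattern = re.compile(r'(?i)(?<=/containers)/.*?(?=","eventtype)')


def extract_path(text):
    """Return the container/blob path without the /blobs/ segment, or None."""
    m = pattern.search(text)
    if m is None:
        return None
    # A single match cannot skip text in the middle, so the unwanted
    # segment is removed after matching.
    return m.group(0).replace("/blobs/", "/", 1)


print(extract_path(sample))
```

The replace step is what "makes the hole"; the regex itself only ever returns a contiguous slice of the input.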

3 Answers

2

If you know you will always want to remove /blobs/, simply replace it with a / after matching.

On the other hand, pasting your pattern into Regex101 shows that your expression yields 3 groups, one of which is /blobs/. So in your case it is as simple as building a new string from the other two: "/" + Groups[1].Value + "/" + Groups[3].Value.
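
A rough sketch of that reconstruction, using the pattern from the question unchanged. It is in Python instead of .NET (an assumption made only so the snippet is runnable as-is), and `rebuild_path` and the shortened `sample` are hypothetical names for illustration:

```python
import re

# Shortened version of the sample text from the question.
sample = (
    'storageAccounts/vvvvvdgdevstor","subject":"/blobServices/default'
    '/containers/coloradohhhhready/blobs/README_.._.hl7",'
    '"eventType":"Microsoft.Storage.BlobCreated"'
)

# The asker's original pattern, copied verbatim.
pattern = re.compile(
    r'(?i)(?<=\/containers\/)(.*)(?=\/blobs\/)(.*)(?<=\/blobs\/)(.*)(?=","eventtype)'
)


def rebuild_path(text):
    """Join group 1 (container) and group 3 (blob name), skipping group 2."""
    m = pattern.search(text)
    if m is None:
        return None
    # Group 2 holds the unwanted "/blobs/" segment, so it is left out.
    return "/" + m.group(1) + "/" + m.group(3)


print(rebuild_path(sample))
```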

3 Comments

In other words, you're saying that this is not possible to do with pure regex?
I'm interested in a purely regex solution.
The reason this had a .net tag was that I wanted to ensure the regex was .NET-compliant, but since that was confusing, I've removed the .net tag.
1

If you want to do this with a regex, you could use two capturing groups and, instead of lookarounds, match what comes before and after them, with /blobs in the middle.

The capturing group (/[^/]+) matches a forward slash followed by one or more characters other than /.

Your values are in capturing group 1 and group 2.

(?i)(?:/containers)(/[^/]+)/blobs(/[^"/]+)(?:","eventtype")
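
A small sketch of how the two groups might be used, in Python rather than C# (an assumption for easy testing). The names `join_groups` and `substitute_groups` are hypothetical, and the second function is an extra variation not taken from this answer: it wraps the same groups in a pattern that consumes the whole input, so a regex substitution with `\1\2` leaves only the wanted text.

```python
import re

# Shortened version of the sample text from the question.
sample = (
    'storageAccounts/vvvvvdgdevstor","subject":"/blobServices/default'
    '/containers/coloradohhhhready/blobs/README_.._.hl7",'
    '"eventType":"Microsoft.Storage.BlobCreated"'
)

# Pattern from the answer: group 1 is the container, group 2 the blob name.
pattern = re.compile(r'(?i)(?:/containers)(/[^/]+)/blobs(/[^"/]+)(?:","eventtype")')

# Variation (assumption): same groups, but the match spans the whole input.
whole = re.compile(r'(?is)^.*?/containers(/[^/]+)/blobs(/[^"/]+)","eventtype".*$')


def join_groups(text):
    """Concatenate the two capturing groups of the first match."""
    m = pattern.search(text)
    return None if m is None else m.group(1) + m.group(2)


def substitute_groups(text):
    """Replace the entire input with group 1 followed by group 2."""
    return whole.sub(r"\1\2", text)


print(join_groups(sample))
print(substitute_groups(sample))
```

Either way the two pieces are still stitched together outside the match itself, either by concatenation or by the replacement string, which is consistent with the comments saying a single match cannot skip over /blobs.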

Regex demo | .NET C# example

2 Comments

The full match returns the unwanted /containers/ and /blobs/ strings: drive.google.com/uc?id=1aMlln70rbrKY4cBHtPY1m6kwvuxQcScy
Thank you, I see that. However, this is not a purely regex solution: Console.WriteLine(m.Groups[1].Value + m.Groups[2].Value); I don't want to involve any code besides the regex, not C#.
0

My guess is that this expression,

(?:\/blobs\/)|(.*?)

might be an option to start with.

Demo

2 Comments

Thanks again for your help, Emma. In other words, this is not possible to do with pure regex?
That's a really cool solution; however, it looks like it's capturing everything except for /blobs: drive.google.com/uc?id=15mxMKImot48Lj-e_o4wxJSqaKXkVTjLS
