I have raw data in the format:

<&70><+><&10><+>.002<&70><+>B<&70><+>A<+><&90>

I would like to use VB to extract the contents between < and > and store them in an array, such as {&70, +, &10, +, ...}

Then I would like to do the same for the items NOT between the brackets, also storing in an array like {.002, B, A}

Does anyone know how I can do this using regex?

  • Here's a doc on regexes in VB, support.microsoft.com/en-us/kb/818802, and a regex that should pull the data you need: <.+?>. I don't know regex syntax in VB, so I can't offer much more right now. Commented Jun 17, 2015 at 21:09
  • @chris85 Your link is about VB6, not VB.NET. Here is the Regex class documentation. Commented Jun 17, 2015 at 21:29

2 Answers

1

Storing the results in arrays up front is awkward because you don't know how many items there will be, but you can use List(Of String) instead (and, as a final step, call ToArray on each list if you really want arrays).

Using the regex given by sln (which is way better than what I was doing), you need to loop through all the matches in your data using Regex.Matches (or its instance counterpart if you want to create [and compile] the regex up front).
Then, if the current match has the indicated group (1 or 2, as shown in sln's regex explanation), add it to the corresponding list.

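' At the top of the file:
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

' Inside a method:
Dim rawData As String = "<&70><+><&10><+>.002<&70><+>B<&70><+>A<+><&90>"

Dim betweenBrackets As New List(Of String)
Dim outsideBrackets As New List(Of String)

' Group 1 captures the text inside <...>; group 2 captures a run of text outside any <...>.
For Each m As Match In Regex.Matches(rawData, "<([^>]*)>|((?:(?!<[^>]*>)[\S\s])+)")
    If m.Groups(1).Success Then betweenBrackets.Add(m.Groups(1).Value)
    If m.Groups(2).Success Then outsideBrackets.Add(m.Groups(2).Value)
Next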
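If you do want plain arrays at the end, here is a minimal sketch of the loop wrapped in a function with the final ToArray step. The module name BracketSplitDemo and the helper SplitBrackets are hypothetical names chosen for illustration, and returning a Tuple is just one possible way to hand back both arrays.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module BracketSplitDemo
    ' Hypothetical helper (not part of any library): returns the bracketed
    ' items as Item1 and the non-bracketed items as Item2.
    Function SplitBrackets(rawData As String) As Tuple(Of String(), String())
        Dim inside As New List(Of String)
        Dim outside As New List(Of String)

        For Each m As Match In Regex.Matches(rawData, "<([^>]*)>|((?:(?!<[^>]*>)[\S\s])+)")
            If m.Groups(1).Success Then inside.Add(m.Groups(1).Value)
            If m.Groups(2).Success Then outside.Add(m.Groups(2).Value)
        Next

        ' Convert the lists to arrays only once their final size is known.
        Return Tuple.Create(inside.ToArray(), outside.ToArray())
    End Function

    Sub Main()
        Dim parts = SplitBrackets("<&70><+><&10><+>.002<&70><+>B<&70><+>A<+><&90>")
        Console.WriteLine(String.Join(",", parts.Item1)) ' &70,+,&10,+,&70,+,&70,+,+,&90
        Console.WriteLine(String.Join(",", parts.Item2)) ' .002,B,A
    End Sub
End Module
```

Individual elements can then be read with the usual indexer, for example parts.Item2(0) for the first non-bracketed item.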

4 Comments

Thank you @Sehnsucht! How can I then access the elements of the lists in VB? When I try to access an item using Return betweenBrackets.Item(0) I get an error.
@DanielWang Yes, that's my fault: I swapped the arguments of Regex.Matches by mistake, so the code ran fine but obviously nothing matched (hence no items in the list). I've edited my answer to correct it. As a side note, you can access items in a list directly through the indexer, like an array: betweenBrackets(0) is exactly the same as betweenBrackets.Item(0).
Thanks @Sehnsucht! When I try outputting outsideBrackets(0) I get "+>" instead of ".002". Do you know how I could fix this?
@DanielWang Not really; when I execute the code I don't get what you describe, only the expected result.
0

I would do both operations at the same time in a global search.
If group 1 matched, push into bracket array,
if group 2 matched, push into non-bracket array.

 # <([^>]*)>|((?:(?!<[^>]*>)[\S\s])+)

   < 
   ( [^>]* )                     # (1)
   >
|  
   (                             # (2 start)
        (?:
             (?! < [^>]* > )
             [\S\s] 
        )+
   )                             # (2 end)
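To use the expanded form directly from VB.NET, one option is to pass it to the Regex constructor with RegexOptions.IgnorePatternWhitespace, which tells the engine to skip unescaped whitespace and treat # as the start of a comment. The sketch below assumes that approach; the module name ExpandedRegexDemo and the helper BuildPattern are hypothetical, and the inline comments are my reading of each part of the pattern.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ExpandedRegexDemo
    ' Hypothetical helper: builds the free-spacing pattern line by line so that
    ' each # comment ends at a real line break.
    Function BuildPattern() As String
        Return String.Join(Environment.NewLine, New String() {
            "   <                        # literal opening bracket",
            "   ( [^>]* )                # group 1: everything up to the next closing bracket",
            "   >                        # literal closing bracket",
            " |                          # otherwise",
            "   (                        # group 2: a run of text outside brackets",
            "      (?:",
            "         (?! < [^>]* > )    # stop if a bracketed token starts here",
            "         [\S\s]             # else consume any one character, newlines included",
            "      )+",
            "   )"
        })
    End Function

    Sub Main()
        Dim rx As New Regex(BuildPattern(), RegexOptions.IgnorePatternWhitespace)

        ' Each match sets exactly one of the two groups.
        For Each m As Match In rx.Matches("<&70><+>.002<+>B")
            If m.Groups(1).Success Then
                Console.WriteLine("inside:  " & m.Groups(1).Value)
            Else
                Console.WriteLine("outside: " & m.Groups(2).Value)
            End If
        Next
    End Sub
End Module
```

The negative lookahead is what keeps group 2 from swallowing a following <...> token: the run of outside text ends at the first position where a complete bracketed token could begin.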

1 Comment

Could you explain how exactly this works and how I can incorporate it into VB?
