I'm looking to create a VBA regular expression that will find the existence of two particular strings inside a set of parentheses.

For example, in this expression:

(aaa, bbb, ccc, ddd, xxx aaa)

it should somehow tell me that it found both "aaa" AND "xxx aaa" in the expression. I.e., since there is a match on "aaa" without the "xxx " in front, and there is also a match on "xxx aaa" later on in the expression, it should return true. Since these two sequences can appear in either order, the reverse order should also return true.

So I'm thinking the expression/s would be something like this:

"(xxx aaa"[^x][^x][^x][^x]aaa)"

to find the words in one order and

"(aaa"[^x][^x][^x][^x]xxx aaa)"

for the words in another order.

Does this make sense? Or is there a better approach?

I know this is changing the spec, but there is one important addendum: there cannot be any intervening parentheses between the terms.

So for example, this shouldn't match:

(aaa, bbb, ccc, ddd, (eee, xxx aaa))

In other words I'm trying to look in between a matching set of parentheses only.

  • Can you remove the parentheses and call Split to separate the entries into an array that you can search? Commented Jul 30, 2012 at 15:11
  • couldn't you just use the InStr function for this? You could just use a boolean variable or something and set it to true if it finds a location for the phrase you're looking for in the string? InStr function found here: msdn.microsoft.com/en-us/library/8460tsh1(v=vs.80).aspx Commented Jul 30, 2012 at 15:11
  • I tried to answer your question as best I could, but your problem definition is unclear. a) Regex will never have a notion of "matching parentheses". It's technically impossible. b) You seem to assume that , is some kind of separator, but you never really define that. Commented Jul 30, 2012 at 18:18

2 Answers

Zero-width look-ahead assertions are your friend.

Function FindInParen(str As String, term1 As String, term2 As String) As Boolean
  Dim re As New VBScript_RegExp_55.RegExp

  ' "(" followed by three look-aheads: a ")" with no parens in between, term1, term2
  re.Pattern = "\(" & _
               "(?=[^()]*\))" & _
               "(?=[^()]*\b" & RegexEscape(term1) & "\b)" & _
               "(?=[^()]*\b" & RegexEscape(term2) & "\b)"

  FindInParen = re.Test(str)
End Function

Function RegexEscape(str As String) As String
  With New VBScript_RegExp_55.RegExp
    .Pattern = "[.+*?^$|\[\](){}\\]"
    .Global = True
    RegexEscape = .Replace(str, "\$&")
  End With
End Function

This pattern reads as:

  • Starting from an opening paren, check:
    • that a closing paren follows, with no other parens in between
    • that term1 occurs before the closing paren
    • that term2 occurs before the closing paren

Since I'm using look-ahead ((?=...)), the regex engine never actually moves forward on the string, so I can chain as many look-ahead assertions as I like and have them all checked from the same position. A side-effect is that the order in which term1 and term2 occur in the string doesn't matter.
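
Because each look-ahead is checked independently, the same idea should extend to any number of terms. The following is only a sketch under assumptions, not part of the function above: FindAllInParen and EscapeForRegex are hypothetical names, and it uses late binding through CreateObject("VBScript.RegExp") so that it can run without the project reference.

```vba
' Sketch: require every term in a list to occur inside one paren pair
' that contains no nested parens.
' FindAllInParen and EscapeForRegex are hypothetical helper names.
Function FindAllInParen(ByVal src As String, ParamArray terms() As Variant) As Boolean
  Dim re As Object
  Dim pat As String
  Dim i As Long

  Set re = CreateObject("VBScript.RegExp")   ' late binding

  ' Anchor at "(" and require a ")" with no parens in between.
  pat = "\((?=[^()]*\))"

  ' One look-ahead per term. None of them consumes input,
  ' so the order of the terms in the text is irrelevant.
  For i = LBound(terms) To UBound(terms)
    pat = pat & "(?=[^()]*\b" & EscapeForRegex(CStr(terms(i))) & "\b)"
  Next i

  re.Pattern = pat
  FindAllInParen = re.Test(src)
End Function

' Escapes regex metacharacters so a term is matched literally.
Function EscapeForRegex(ByVal s As String) As String
  With CreateObject("VBScript.RegExp")
    .Pattern = "[.+*?^$|\[\](){}\\]"
    .Global = True
    EscapeForRegex = .Replace(s, "\$&")
  End With
End Function
```

For example, FindAllInParen("(aaa, bbb, ccc)", "ccc", "aaa", "bbb") should return True, while FindAllInParen("(aaa, bbb, ccc)", "aaa", "ddd") should return False.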

I tested FindInParen on the console ("Immediate Window"):

? FindInParen("(aaa, bbb, ccc, ddd, xxx aaa)", "aaa", "xxx aaa")
True

? FindInParen("(aaa, bbb, ccc, ddd, (eee, xxx aaa))", "aaa", "xxx aaa")
True

? FindInParen("(aaa, bbb, ccc, ddd, (eee, xxx aaa))", "bbb", "xxx aaa")
False

Notes:

  • The second test yields True because, technically, both aaa and xxx aaa are inside the same set of parens: the inner pair contains "xxx aaa", and \baaa\b also matches the "aaa" at the end of it.
  • Regex cannot deal with nested structures. You will never get nested parentheses right with regular expressions. You will never be able to find "a matching set of parens" with regex alone - only an opening/closing pair that has no other parens in-between. Write a parser if you need to handle nesting.
  • Make a reference to "Microsoft VBScript Regular Expressions 5.5" in your project.

FWIW, here's a minimal nesting-aware function that works for the second test case above:

Function FindInParen(ByVal str As String, term1 As String, term2 As String) As Boolean
  Dim parenPair As New VBScript_RegExp_55.RegExp
  Dim terms As New VBScript_RegExp_55.RegExp
  Dim matches As VBScript_RegExp_55.MatchCollection

  FindInParen = False
  parenPair.Pattern = "\([^()]*\)"
  terms.Pattern = "(?=.*?[(,]\s*(?=\b" & RegexEscape(Trim(term1)) & "\b))" & _
                  "(?=.*?[(,]\s*(?=\b" & RegexEscape(Trim(term2)) & "\b))"

  Do
    Set matches = parenPair.Execute(str)
    If matches.Count Then
      If terms.Test(matches(0).Value) Then
        Debug.Print "found here: " & matches(0).Value
        FindInParen = True
      End If
      str = parenPair.Replace(str, "[...]")
    End If
  Loop Until FindInParen Or matches.Count = 0

  If Not FindInParen Then
    Debug.Print "not found"
  End If

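  ' InStr takes the string to search first. Only check once every balanced pair has been consumed.
  If Not FindInParen And (InStr(str, "(") > 0 Or InStr(str, ")") > 0) Then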
    Debug.Print "mis-matched parens"
  End If
End Function

Console:

? FindInParen("(aaa, bbb, ccc, ddd, (eee, xxx aaa))", "aaa", "xxx aaa")
not found
False

? FindInParen("(aaa, bbb, ccc, ddd, (eee, xxx aaa))", "eee", "xxx aaa")
found here: (eee, xxx aaa)
True
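
For comparison, the "write a parser" route mentioned in the notes can be sketched without any regular expressions. This is a minimal sketch under assumptions that the question does not state: entries are separated by commas, and a term counts only if it equals a whole entry after trimming. FindInParenScan and HasEntry are hypothetical names.

```vba
' Sketch of a nesting-aware check that tracks paren depth by hand.
' Assumptions: comma-separated entries, whole-entry comparison.
Function FindInParenScan(ByVal src As String, ByVal term1 As String, ByVal term2 As String) As Boolean
  Dim levels() As String     ' levels(d) collects the text seen directly at depth d
  Dim depth As Long, i As Long
  Dim ch As String

  ReDim levels(0 To Len(src) + 1)

  For i = 1 To Len(src)
    ch = Mid$(src, i, 1)
    Select Case ch
      Case "("
        depth = depth + 1
        levels(depth) = ""
      Case ")"
        If depth = 0 Then Exit Function      ' stray ")": give up, result stays False
        If HasEntry(levels(depth), term1) And HasEntry(levels(depth), term2) Then
          FindInParenScan = True
          Exit Function
        End If
        depth = depth - 1
      Case Else
        ' Text inside a nested pair goes to the nested level only.
        If depth > 0 Then levels(depth) = levels(depth) & ch
    End Select
  Next i
End Function

' True if term equals one of the comma-separated entries (after trimming).
Function HasEntry(ByVal entries As String, ByVal term As String) As Boolean
  Dim part As Variant
  For Each part In Split(entries, ",")
    If Trim$(part) = Trim$(term) Then
      HasEntry = True
      Exit Function
    End If
  Next part
End Function
```

Under those assumptions, the three inputs used earlier should give True for ("(aaa, bbb, ccc, ddd, xxx aaa)", "aaa", "xxx aaa"), False for ("(aaa, bbb, ccc, ddd, (eee, xxx aaa))", "aaa", "xxx aaa"), and True for the same nested string with "eee" and "xxx aaa".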

4 Comments

I would add Global (see Tim Williams' answer below) and loop through all matching terms. Very helpful, thanks.
@JackBeNimble Adding "global" to the terms.Pattern would not make sense, as it always returns the empty string.
Ok, but the second test case above returns false for me, on both iterations of the function. Not sure why, it's hard to figure out the expressions.
@JackBeNimble I thought that was the whole point - the second test case is exactly the one you have in your question, and there you say that it should return false.
1

It's not really clear from your question exactly what you want (and maybe Regexp is not really needed here) but this might be close:

Sub Tester()
    RegexpTest ("(aaa, bbb, ccc, ddd, xxx aaa)")
End Sub


Sub RegexpTest(txt As String)
    Dim re As Object
    Dim allMatches, m

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "([^,\(]*aaa)"
    re.ignorecase = True
    re.Global = True

    Set allMatches = re.Execute(txt)

    For Each m In allMatches
        Debug.Print Trim(m)
    Next m

End Sub
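
To turn that listing into the yes/no answer the question asks for, the matches could be compared against the two target strings. This is only a sketch: HasBothEntries is a hypothetical name, the two strings are hard-coded from the question's example, and like the code above it does not look at parentheses nesting at all.

```vba
' Sketch: True if both "aaa" and "xxx aaa" appear as separate entries.
' HasBothEntries is a hypothetical name. Nesting is ignored.
Function HasBothEntries(ByVal txt As String) As Boolean
    Dim re As Object, m As Object
    Dim foundPlain As Boolean, foundPrefixed As Boolean

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "[^,\(]*aaa"     ' same idea as above: an entry ending in "aaa"
    re.IgnoreCase = True
    re.Global = True

    For Each m In re.Execute(txt)
        Select Case LCase$(Trim$(m.Value))
            Case "aaa": foundPlain = True
            Case "xxx aaa": foundPrefixed = True
        End Select
    Next m

    HasBothEntries = foundPlain And foundPrefixed
End Function
```

With the question's first example, HasBothEntries("(aaa, bbb, ccc, ddd, xxx aaa)") should return True.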
