I am looking for a way to negate a previously set matching pattern in order to pull out everything that is in between two characters.

I have the following code matching comments in SQL code in the "/* comment */" format. It will pick up the original code in column A and then strip the comments, placing the trimmed string in column B:

Sub FindComments()

Dim xOutArr As Variant
Dim RegEx As Object
Dim xOutRg As Range
Dim SQLString As Variant
Dim i As Long   ' Long, not Integer: row numbers can exceed 32,767
Dim lr As Long

' Build the regex once; the pattern is the same for every row
Set RegEx = CreateObject("VBScript.RegExp")
With RegEx
    .Global = True
    .MultiLine = True
    .IgnoreCase = False
    .Pattern = "(/\*(.*?)\*/)"
End With

lr = Worksheets("Sheet1").Cells(Rows.Count, "A").End(xlUp).Row
For i = 2 To lr
    SQLString = Worksheets("Sheet1").Cells(i, "A").Value

    ' Strip every /* ... */ comment from the SQL text
    If RegEx.Test(SQLString) Then
        SQLString = RegEx.Replace(SQLString, "")
    End If

    ' Split into individual statements and append them to column B
    xOutArr = VBA.Split(SQLString, ";")
    Set xOutRg = Worksheets("Sheet1").Range("B" & (Worksheets("Sheet1").Cells(Rows.Count, "B").End(xlUp).Row + 1))
    xOutRg.Range("A1").Resize(UBound(xOutArr) + 1, 1) = Application.WorksheetFunction.Transpose(xOutArr)

Next i

Set RegEx = Nothing

End Sub

The code above finds anything written between "/* " and " */" and removes it, but I also want to extract whatever sits between those two delimiters. In other words, I need to match everything that does not satisfy that pattern (or some other pattern, such as "< comment >"), including line breaks. This is specifically for VBA, and it needs to search the entire string for every instance of the pattern. My goal is to put the contents between those delimiters into column C.

What would be the RegExp pattern for this?

Examples of SQLString would be:

1) /* Step 1 */ Select * from dual ;

2) /* Step 2 */ Select * from dual ; /* Step 3 */ Select * from Table

I am capturing the SQL code by removing the "/* Step # */" but I want to capture what is in those comments as well (in Column C). 1) and 2) are single rows. 2) has multiple queries. Each row is getting split by ";" in order to run queries one by one.

1 Answer

Instead of using Test, you can use Execute to get all matching strings from the SQL. Loop over the match collection, storing each match in column C, and use Replace() to remove it from the original SQL:

Sub Tester()
    ExtractComments Range("A1")
End Sub

Sub ExtractComments(c As Range)
    Dim re As Object
    Dim allMatches, m, txt, comm

    Set re = CreateObject("VBScript.RegExp")
    ' [\s\S] rather than "." so that a comment may span line breaks
    re.Pattern = "(/\*([\s\S]*?)\*/)"
    re.IgnoreCase = True
    re.MultiLine = True
    re.Global = True

    txt = c.Value

    Set allMatches = re.Execute(txt)
    For Each m In allMatches
        ' Collect each comment, one per line
        comm = comm & IIf(Len(comm) > 0, vbLf, "") & m.Value
        ' Remove the comment from the SQL text
        txt = Replace(txt, m.Value, "")
        Debug.Print Trim(m.Value)
    Next m
    c.Offset(0, 1).Value = txt   ' next column: SQL without comments
    c.Offset(0, 2).Value = comm  ' column after that: the comments
End Sub
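
If only the text inside the delimiters is wanted (for example "Step 2" rather than "/* Step 2 */"), the capture group can be read through SubMatches. The following is a minimal sketch, not part of the answer's code: CommentContents is a hypothetical helper name, and it assumes the delimiters are "/*" and "*/" and that surrounding spaces should be trimmed.

```vba
' Sketch: return the text inside each /* ... */ comment, one per line.
' CommentContents is a hypothetical helper name used for illustration.
Function CommentContents(ByVal sql As String) As String
    Dim re As Object, matches As Object
    Dim i As Long
    Dim result As String

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True                    ' find every comment, not just the first
    re.Pattern = "/\*([\s\S]*?)\*/"     ' [\s\S] also matches line breaks; "." does not

    Set matches = re.Execute(sql)
    For i = 0 To matches.Count - 1
        If i > 0 Then result = result & vbLf
        ' SubMatches(0) is the first capture group: the text between the delimiters
        result = result & Trim$(matches(i).SubMatches(0))
    Next i

    CommentContents = result
End Function
```

It could be called from ExtractComments as c.Offset(0, 2).Value = CommentContents(c.Value).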

4 Comments

Is it not possible to negate a pattern using VBA / VBScript? I've seen posts saying (?! pattern ) is what would be needed, but that doesn't return anything different than the original text with both SQL and comments. Your answer would be a sufficient workaround as it stands, but I am wondering if that truly is the only way to handle inverse matching of regex patterns.
I guess I'm not clear on what would constitute "inverse matching" here - it's much more straightforward to identify the comments than the "non-comments"
Basically I know what I want to keep is going to appear in a pattern (symbol + random string + symbol). I am wondering if there is a way to say "Match everything in this string that does not satisfy this pattern" using RegExp. Once I can select the stuff I don't want to keep, I will replace it with nothing.
Let's say that I want to match "/* Name */" that is within the query "/* Name */ SELECT * FROM dual ;". I can replace it with nothing to leave just the query. I already have something that matches the "/* Name */", and now I want to match the rest (i.e. "SELECT * FROM dual ; "). I can replace THIS with nothing to leave just the name.
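
On the "inverse matching" question in the comments above: VBScript.RegExp does accept (?! ... ), but a negative lookahead only asserts what does not follow at a given position and consumes no text, so on its own it does not select "everything except the comments". One way to get the inverse is to take the gaps between the matches, using each match's FirstIndex and Length. The following is a minimal sketch under that assumption; NonCommentText is a hypothetical helper name.

```vba
' Sketch: collect the text that lies outside every /* ... */ comment.
' NonCommentText is a hypothetical helper name used for illustration.
Function NonCommentText(ByVal sql As String) As String
    Dim re As Object, matches As Object, m As Object
    Dim pos As Long          ' 0-based index of the first character not yet consumed
    Dim result As String

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.Pattern = "/\*[\s\S]*?\*/"

    Set matches = re.Execute(sql)
    For Each m In matches
        ' Copy the gap between the previous match and this one.
        ' FirstIndex is 0-based, Mid$ is 1-based.
        result = result & Mid$(sql, pos + 1, m.FirstIndex - pos)
        pos = m.FirstIndex + m.Length
    Next m
    result = result & Mid$(sql, pos + 1)   ' tail after the last comment

    NonCommentText = result
End Function
```

For this pattern the result is the same as re.Replace(sql, ""); the loop is spelled out only to make the gaps explicit, and it can be adapted to store each gap separately instead of concatenating them.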
