
This is probably a simple problem, but unfortunately I wasn't able to get the results I wanted...

Say, I have the following line:

"Wouldn't It Be Nice" (B. Wilson/Asher/Love)

I would have to look for this pattern:

" (<any string>)

In order to retrieve:

B. Wilson/Asher/Love

I tried something like "" (([^))]*)) but it doesn't seem to work. Also, I'd like to use Match.Submatches(0) so that might complicate things a bit because it relies on brackets...


6 Answers

25

Edit: After examining your document, the problem is that there are non-breaking spaces before the parentheses, not regular spaces. So this regex should work: ""[ \xA0]*\(([^)]+)\)

""       'quote (twice to escape)
[ \xA0]* 'zero or more non-breaking (\xA0) or regular spaces
\(       'left parenthesis
(        'open capturing group
[^)]+    'anything not a right parenthesis
)        'close capturing group
\)       'right parenthesis

In a function:

Public Function GetStringInParens(search_str As String) As String
    ' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
    Dim regEx As New VBScript_RegExp_55.RegExp
    Dim matches As VBScript_RegExp_55.MatchCollection

    GetStringInParens = ""
    regEx.Pattern = """[ \xA0]*\(([^)]+)\)"
    regEx.Global = True
    If regEx.Test(search_str) Then
        Set matches = regEx.Execute(search_str)
        ' First match, first capturing group
        GetStringInParens = matches(0).SubMatches(0)
    End If
End Function
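
To see the non-breaking-space issue in isolation, here is a minimal sketch. The function name CreditAfterQuote and the test strings are made up for illustration, and it uses late binding via CreateObject so it runs without setting a library reference. ChrW(160) stands in for the non-breaking space found in the document.

```vba
' Sketch only: same pattern as above, late-bound so no reference is needed.
' CreditAfterQuote is a hypothetical name, not part of the original answer.
Function CreditAfterQuote(ByVal s As String) As String
    Dim re As Object, m As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = """[ \xA0]*\(([^)]+)\)"
    Set m = re.Execute(s)
    ' Returns an empty string when nothing matches
    If m.Count > 0 Then CreditAfterQuote = m(0).SubMatches(0)
End Function

Sub DemoCreditAfterQuote()
    Dim nbsp As String
    nbsp = ChrW(160)   ' non-breaking space, U+00A0

    ' Regular space between the closing quote and the parenthesis
    Debug.Print CreditAfterQuote("""Wouldn't It Be Nice"" (B. Wilson/Asher/Love)")
    ' Non-breaking space in the same position
    Debug.Print CreditAfterQuote("""Wouldn't It Be Nice""" & nbsp & "(B. Wilson/Asher/Love)")
    ' Parentheses only inside the title: no match, prints an empty line
    Debug.Print CreditAfterQuote("""Don't Talk (Put Your Head on My Shoulder)""")
End Sub
```

The first two calls should both print B. Wilson/Asher/Love. The third prints nothing, because no quote in that string is directly followed by optional spaces and an opening parenthesis.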

8 Comments

Annoyingly, it doesn't seem to work. I tried your literal method as well as incorporating it into my method... It really seems to be an issue with the regex itself: as soon as I replace the regex with a working regex, all goes well. Anyway, I thought it might be useful to give you the exact .docm file I have now, so you can have a look: db.tt/6XoO1Pbn The input text is already in the doc. Thanks in advance!
See my edit. Looks like there are non-breaking spaces in the document. That's what was messing us up. Hope it works for you now.
This one is definitely working! I had some concern with the right boundary, resulting in a mismatch when a ) is mentioned in the parentheses content. I wanted to propose letting the regex find the last ) in the line. But then I found this string: "They Called It Rock" (Lowe, Rockpile, Dave Edmunds) - 3:10 (bonus single-sided 45, credited as Rockpile, not on original LP). There goes my plan :) BTW, ) or ) - wouldn't work either, since the dash can differ and there's sometimes nothing at all after the ). I guess this can't be improved, agreed?
I don't see the problem. It's matching Lowe, Rockpile, Dave Edmunds, and not (bonus ... LP). That's what you want, right? If you're seeing something different, I'm not sure why, but, no, I would say it can't be improved.
@BKSpurgeon test() is a method on the regular expression object regex. You pass test() a string as a parameter. If the string matches the Pattern attribute of regex, test() returns True. Otherwise, False. See msdn.microsoft.com/en-us/library/y32x2hy1(v=vs.84).aspx
4

Not strictly an answer to your question, but sometimes, for things this simple, good ol' string functions are less confusing and more concise than Regex.

Function BetweenParentheses(s As String) As String
    BetweenParentheses = Mid(s, InStr(s, "(") + 1, _
        InStr(s, ")") - InStr(s, "(") - 1)
End Function

Usage:

Debug.Print BetweenParentheses("""Wouldn't It Be Nice"" (B. Wilson/Asher/Love)")
'B. Wilson/Asher/Love

EDIT: @alan points out that this will falsely match the contents of parentheses in the song title. This is easily circumvented with a little modification:

Function BetweenParentheses(s As String) As String
    Dim iEndQuote As Long
    Dim iLeftParenthesis As Long
    Dim iRightParenthesis As Long

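    iEndQuote = InStrRev(s, """")
    ' No quote at all: bail out, since InStr raises an error when Start is 0
    If iEndQuote = 0 Then Exit Function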
    iLeftParenthesis = InStr(iEndQuote, s, "(")
    iRightParenthesis = InStr(iEndQuote, s, ")")

    If iLeftParenthesis <> 0 And iRightParenthesis <> 0 Then
        BetweenParentheses = Mid(s, iLeftParenthesis + 1, _
            iRightParenthesis - iLeftParenthesis - 1)
    End If
End Function

Usage:

Debug.Print BetweenParentheses("""Wouldn't It Be Nice"" (B. Wilson/Asher/Love)")
'B. Wilson/Asher/Love
Debug.Print BetweenParentheses("""Don't talk (yell)""")
' returns empty string

Of course this is less concise than before!

5 Comments

I thought of suggesting this, too, but it falsely matches "Don't Talk (Put Your Head on My Shoulder)"
+1 for suggesting something other than the OP's preferred method.
Yeah, I appreciate the different approach. I do think I still prefer Regex. I don't know about the efficiency of it (speed is not my greatest concern) but I just like the compact notation. My main concern with this method is that it doesn't seem very specific. The left boundary is initially established as the last " of the string. If artist name contains any quote this will cause problems. So I still prefer to use " ( as left boundary.
Thanks for the feedback, it is appreciated. The solution that solves the problem is the best one, regardless of the exact implementation. As you can see, there are several ways to reach your goal (extracting a substring). Focus on the goal, rather than a particular way of reaching it. Requiring that your goal be reached only by a specific path limits your options.
@KeyMs92: What if the artist name contains " (? My point is, you have to define your problem precisely, otherwise any solution, regex or not, will have false positives / false negatives.
3

This is a nice regex:

".*\(([^)]*)

In VBA/VBScript:

' SubjectString is assumed to hold the line being searched
Dim myRegExp As RegExp
Dim myMatches As MatchCollection
Dim myMatch As Match
Dim ResultString As String

Set myRegExp = New RegExp
myRegExp.Pattern = """.*\(([^)]*)"
Set myMatches = myRegExp.Execute(SubjectString)
If myMatches.Count >= 1 Then
    Set myMatch = myMatches(0)
    ' The pattern has a single capturing group, so it is SubMatches(0)
    If myMatch.SubMatches.Count >= 1 Then
        ResultString = myMatch.SubMatches(0)
    Else
        ResultString = ""
    End If
Else
    ResultString = ""
End If

This matches

Put Your Head on My Shoulder

in

"Don't Talk (Put Your Head on My Shoulder)"  

Update 1

I let the regex loose on your doc file and it matches as requested. I'm quite sure the regex is fine. I'm not fluent in VBA/VBScript, but my guess is that's where it goes wrong.

If you want to discuss the regex further, that's fine with me. I'm not eager to start digging into this VBScript API, which looks arcane.

Given the new input the regex is tweaked to

".*".*\(([^)]*)

So that it doesn't falsely match (Put Your Head on My Shoulder) which appears inside the quotes.
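
One thing worth checking with the tweaked pattern is what the greedy .* does when a line has more than one parenthesized group after the title. The sketch below is illustrative only: FirstGroup is a made-up helper, and the second pattern, which swaps the second .* for [^(]*, is my own variation rather than something proposed in this thread. The test line is the "They Called It Rock" example quoted in the comments on another answer.

```vba
' Sketch only: FirstGroup is a hypothetical helper that returns
' capturing group 1 of the first match, or "" when nothing matches.
Function FirstGroup(ByVal pattern As String, ByVal s As String) As String
    Dim re As Object, m As Object
    Set re = CreateObject("VBScript.RegExp")   ' late binding
    re.Pattern = pattern
    Set m = re.Execute(s)
    If m.Count > 0 Then FirstGroup = m(0).SubMatches(0)
End Function

Sub DemoGreedyVersusFirstParen()
    ' Pattern from the answer: ".*".*\(([^)]*)
    Const GREEDY As String = """.*"".*\(([^)]*)"
    ' Assumed variation: ".*"[^(]*\(([^)]*)
    Const FIRST_PAREN As String = """.*""[^(]*\(([^)]*)"

    Dim s As String
    s = """They Called It Rock"" (Lowe, Rockpile, Dave Edmunds) - 3:10 " & _
        "(bonus single-sided 45, credited as Rockpile, not on original LP)"

    ' Greedy .* runs to the LAST "(" on the line
    Debug.Print FirstGroup(GREEDY, s)
    ' [^(]* stops at the FIRST "(" after the closing quote
    Debug.Print FirstGroup(FIRST_PAREN, s)
End Sub
```

With the pattern as posted, the first call should print the "bonus single-sided 45 ..." note. The variation should print Lowe, Rockpile, Dave Edmunds. Which of the two is right depends on how the lines in the source file are actually laid out.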


10 Comments

Thanks for your response. Unfortunately there don't seem to be any matches using this pattern. Let me give you the source I'm testing this on: tiny.cc/ij3ffw.
@KeyMs92 The examples on that webpage are more clear. I updated my answer
Yeah, I should have given a better example. See my OP.
My regex matches the string "B. Wilson/Asher/Love" in group 1. Let me know if you have any more questions.
It seems the problem is with the regex itself. Using Match doesn't work in any case. I've uploaded my docm file in one of the comments so you can have a look.
2

This function worked on your example string:

Function GetArtist(songMeta As String) As String
  Dim artist As String
  ' split string by "(" and take last portion
  artist = Split(songMeta, "(")(UBound(Split(songMeta, "(")))
  ' remove closing parenthesis
  artist = Replace(artist, ")", "")
  ' return the result
  GetArtist = artist
End Function

Ex:

Sub Test()

  Dim songMeta As String

  songMeta = """Wouldn't It Be Nice"" (B. Wilson/Asher/Love)"

  Debug.Print GetArtist(songMeta)

End Sub

prints "B. Wilson/Asher/Love" to the Immediate Window.

It also solves the problem alan mentioned. Ex:

Sub Test()

  Dim songMeta As String

  songMeta = """Wouldn't (It Be) Nice"" (B. Wilson/Asher/Love)"

  Debug.Print GetArtist(songMeta)

End Sub

also prints "B. Wilson/Asher/Love" to the Immediate Window. Unless of course, the artist names also include parentheses.

2 Comments

I like it, but I want to be as specific as possible, so I prefer to use " (
I don't see how that makes a difference. Can you explain?
1

This is another regex, tested with VBScript: (?:\()(.*)(?:\))


Data = """Wouldn't It Be Nice"" (B. Wilson/Asher/Love)"
WScript.Echo Extract(Data)
'---------------------------------------------------------------
Function Extract(Data)
    Dim strPattern, oRegExp, Matches
    strPattern = "(?:\()(.*)(?:\))"
    Set oRegExp = New RegExp
    oRegExp.IgnoreCase = True
    oRegExp.Pattern = strPattern
    Set Matches = oRegExp.Execute(Data)
    If Matches.Count > 0 Then Extract = Matches(0).SubMatches(0)
End Function
'---------------------------------------------------------------
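
A caveat that may be worth testing: the greedy (.*) in this pattern spans from the first ( to the last ) on the line, so it only behaves when the line has a single pair of parentheses. The sketch below is a VBA version for the Immediate Window, using the test string with parentheses in the title from another answer. ExtractGreedy is a hypothetical name.

```vba
' Sketch only: same pattern as above, wrapped in a hypothetical function.
Function ExtractGreedy(ByVal s As String) As String
    Dim re As Object, m As Object
    Set re = CreateObject("VBScript.RegExp")   ' late binding
    re.Pattern = "(?:\()(.*)(?:\))"
    Set m = re.Execute(s)
    If m.Count > 0 Then ExtractGreedy = m(0).SubMatches(0)
End Function

Sub DemoExtractGreedy()
    ' Single pair of parentheses: prints B. Wilson/Asher/Love
    Debug.Print ExtractGreedy("""Wouldn't It Be Nice"" (B. Wilson/Asher/Love)")
    ' Parentheses in the title too: the capture runs from the first "("
    ' to the last ")", so it prints: It Be) Nice" (B. Wilson/Asher/Love
    Debug.Print ExtractGreedy("""Wouldn't (It Be) Nice"" (B. Wilson/Asher/Love)")
End Sub
```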


0

I think you need a better data file ;) You might want to consider pre-processing the file to a temp file for modification, so that outliers that don't fit your pattern are modified to where they'll meet your pattern. It's a bit time consuming to do, but it is always difficult when a data file lacks consistency.
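
As a rough idea of what such a pre-processing step could look like, here is a minimal sketch. It assumes the only inconsistency to clean up is the one found in this thread, non-breaking spaces where regular spaces were expected; NormalizeLine is a made-up name, and a real data file may need more rules.

```vba
' Sketch only: NormalizeLine is a hypothetical helper.
Function NormalizeLine(ByVal s As String) As String
    ' Turn non-breaking spaces (U+00A0) into regular spaces
    s = Replace(s, ChrW(160), " ")
    ' Collapse runs of spaces into a single space
    Do While InStr(s, "  ") > 0
        s = Replace(s, "  ", " ")
    Loop
    ' Drop leading and trailing spaces
    NormalizeLine = Trim$(s)
End Function

Sub DemoNormalizeLine()
    Dim raw As String
    raw = """Wouldn't It Be Nice""" & ChrW(160) & ChrW(160) & "(B. Wilson/Asher/Love)"
    ' Prints the line with one regular space before the parenthesis
    Debug.Print NormalizeLine(raw)
End Sub
```

Once every line has been normalized this way, a pattern that looks for a quote, one regular space, and an opening parenthesis should no longer need the \xA0 alternative.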
