I'm trying to write a regular expression that replaces markdown-style links, but it doesn't seem to be working. This is what I have so far:

# ruby code:
text = "[link me up](http://www.example.com)"
text.gsub!(%r{\[(\+)\]\((\+)\)}x, %{<a target="_blank" href="\\1">\\2</a>})

What am I doing wrong?

  • Why not use a full Ruby Markdown library, like the wonderful kramdown? Commented Feb 13, 2012 at 22:41
  • because I only need a limited subset of markdown features and haven't found a library that allows me to specify which features I want to support (so I'm having to write my own). Commented Feb 14, 2012 at 0:00

1 Answer

irb(main):001:0> text = "[link me up](http://www.example.com)"
irb(main):002:0> text.gsub /\[([^\]]+)\]\(([^)]+)\)/, '<a href="\2">\1</a>'
#=> "<a href=\"http://www.example.com\">link me up</a>"

We can use the extended option for Ruby's regex to make it not look like a cat jumped on the keyboard:

def linkup( str )
  str.gsub %r{
    \[         # Literal opening bracket
      (        # Capture what we find in here
        [^\]]+ # One or more characters other than close bracket
      )        # Stop capturing
    \]         # Literal closing bracket
    \(         # Literal opening parenthesis
      (        # Capture what we find in here
        [^)]+  # One or more characters other than close parenthesis
      )        # Stop capturing
    \)         # Literal closing parenthesis
  }x, '<a href="\2">\1</a>'
end

text = "[link me up](http://www.example.com)"
puts linkup(text)
#=> <a href="http://www.example.com">link me up</a>

Note that the above will fail for URLs that have a right parenthesis in them, e.g.

linkup "[O](http://msdn.microsoft.com/en-us/library/ms533050(v=vs.85).aspx)"
# <a href="http://msdn.microsoft.com/en-us/library/ms533050(v=vs.85">O</a>.aspx)

If this is important to you, you can replace [^)]+ with \S+(?=\)), which means "find as many non-whitespace characters as you can, but ensure that there is a ) afterwards".
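As a minimal sketch of that substitution (the method name linkup_parens is made up here for illustration and is not part of the answer above), the lookahead variant could be written like this:

```ruby
# Hypothetical variant of linkup: the URL group uses \S+ plus a lookahead
# instead of [^)]+, so a right parenthesis inside the URL is kept.
def linkup_parens(str)
  str.gsub(%r{
    \[             # Literal opening bracket
      ([^\]]+)     # Link text: one or more characters other than close bracket
    \]             # Literal closing bracket
    \(             # Literal opening parenthesis
      (\S+(?=\)))  # URL: longest non-whitespace run followed by a close paren
    \)             # Literal closing parenthesis
  }x, '<a href="\2">\1</a>')
end

puts linkup_parens("[O](http://msdn.microsoft.com/en-us/library/ms533050(v=vs.85).aspx)")
#=> <a href="http://msdn.microsoft.com/en-us/library/ms533050(v=vs.85).aspx">O</a>
```

One trade-off to be aware of: \S+ is greedy, so if two links sit next to each other with no whitespace in between, e.g. "[a](x),[b](y)", the URL group will probably swallow everything up to the last closing parenthesis.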


To answer your question "what am I doing wrong", here's what your regex said:

%r{
  \[      # Literal opening bracket   (good)
    (     # Start capturing           (good)
      \+  # A literal plus character  (OOPS)
    )     # Stop capturing            (good)
  \]      # Literal closing bracket   (good)
  \(      # Literal opening paren     (good)
    (     # Start capturing           (good)
      \+  # A literal plus character  (OOPS)
    )     # Stop capturing            (good)
  \)      # Literal closing paren     (good)
}x
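A quick way to confirm this is to try the original pattern against a string made of literal plus signs. The sketch below also swaps the backreferences in the replacement: the question's string puts \1 (the link text) in href and \2 (the URL) between the tags, which looks reversed. The variable names original and fixed are only for illustration.

```ruby
text = "[link me up](http://www.example.com)"

# The question's pattern: each group can only match one literal "+".
original = %r{\[(\+)\]\((\+)\)}x
p original.match?(text)      # false: the real link is not matched
p original.match?("[+](+)")  # true: only literal plus signs match

# Same pattern with the groups replaced by negated character classes,
# and with \2 (URL) in href and \1 (link text) between the tags.
fixed = %r{\[([^\]]+)\]\(([^)]+)\)}x
puts text.gsub(fixed, %{<a target="_blank" href="\\2">\\1</a>})
#=> <a target="_blank" href="http://www.example.com">link me up</a>
```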

8 Comments

Also, I've never seen a URL with parentheses. I didn't even think this was valid. Thanks for pointing that out.
Awesome explanation. Would've given more than +1 if possible. I ended up doing a matcher for [[foobar]] kind of syntax using your advice. Here's my go at it (simplified from yours): /\[\[([^\]]+)\]\]/ .
To support tooltips, like [link](http://example.com "tooltip"), use this regex: /\[([^\]]+)\]\(([^)"]+)(?: \"([^\"]+)\")?\)/
Just wanted to note that, based on the CommonMark spec, opening and closing brackets in a URL are allowed. So a URL like http://example.com?query[]=something should be allowed, but the regex provided does not account for that.
@thatidiotguy Incorrect. The regex prevents a closed bracket within the link text, but not within the URL. See rubular.com/r/kG7s9bHlOl for example.