I'm trying to use regex to convert Slack's version of markdown formatting to BB Code. I'm stuck on links at the moment. Slack formats like this:

<www.url.com|This is the actual text>
<www.url.com>

BB Code formats like this:

[url=www.url.com]This is the actual text[/url]
[url]www.url.com[/url]

I'm handling the first type with this (in JavaScript):

string.replace(/\<([\s\S]+)(?=\|)\|([\s\S]*?)\>/gm, "[url=$1]$2[/url]")

I'm struggling to make a second rule that will only match text between <...> if there isn't a | in the string. Can anyone help me out?

Also if there's a neat way of dealing with both options in one go then let me know!

  • That's not Markdown. Please only use the markdown tag for questions about Markdown. Commented Mar 29, 2022 at 12:09
  • Apologies, Slack calls it their version of Markdown but I agree, it is very different! Commented Mar 29, 2022 at 13:05

1 Answer

You can use

const text = `<www.url.com|This is the actual text>
<www.url.com>`;
// label is undefined when the link has no "|text" part
console.log(text.replace(/<([^<>|]*)(?:\|([^<>]*))?>/g, (match, url, label) =>
  label !== undefined ? `[url=${url}]${label}[/url]` : `[url]${url}[/url]`));

Details:

  • < - a < char (please NEVER escape this char in any regex flavor if you plan to match a < char)
  • ([^<>|]*) - Group 1: any zero or more chars other than <, > and |
  • (?:\|([^<>]*))? - an optional non-capturing group matching one or zero occurrences of a | and then any zero or more chars other than < and > captured into Group 2
  • > - a > char (again, please never escape the > literal char in any regex flavor).
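As a rough sketch, the pattern described above can be wrapped in a small reusable function. The name slackLinksToBBCode and the sample string are made up for illustration; only the regex itself comes from the answer.

```javascript
// Hypothetical helper name; the regex is the one explained in the list above.
function slackLinksToBBCode(input) {
  return input.replace(/<([^<>|]*)(?:\|([^<>]*))?>/g, (match, url, label) =>
    // label is undefined when the optional "|text" group did not take part in the match
    label !== undefined ? `[url=${url}]${label}[/url]` : `[url]${url}[/url]`
  );
}

// Illustrative input only
const sample = "See <www.url.com|This is the actual text> or <www.url.com>.";
console.log(slackLinksToBBCode(sample));
// → See [url=www.url.com]This is the actual text[/url] or [url]www.url.com[/url].
```

Using a replacer callback rather than a replacement string is what lets both link forms be handled in one pass, since the callback can check whether Group 2 matched.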

7 Comments

Could you give more info on why <> should not be escaped?
@anotherGatsby Because it is not a special regex metacharacter, and in some regex flavors \< and \> are word boundaries. A great many users still think that "regex is universal" and will reuse the same pattern across languages and environments.
I knew that they are not special characters but did not know that they were word boundaries in some flavor. Thanks for the info.
Thanks for this super fast response! I'm just getting my head around your notes to make sure I understand it.
@WiktorStribiżew Am I right in thinking that this will fail if the URL contains any of the excluded characters (<, > or |)? Hopefully this isn't a common issue, but perhaps it could be a problem sometimes. Any ideas?
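
One way to probe that last question is simply to run the answer's pattern against such inputs. The sketch below uses made-up URLs and a hypothetical helper name (toBBCode). Whether Slack ever emits these characters unescaped inside a link is not established in this thread, so it is worth checking real payloads.

```javascript
// Same pattern as in the answer; toBBCode is a hypothetical name for this sketch.
const linkPattern = /<([^<>|]*)(?:\|([^<>]*))?>/g;
const toBBCode = (input) =>
  input.replace(linkPattern, (match, url, label) =>
    label !== undefined ? `[url=${url}]${label}[/url]` : `[url]${url}[/url]`);

// A "|" inside the URL: the split happens at the first "|",
// so the rest of the URL ends up in the link text.
const withPipe = toBBCode("<www.url.com/a|b|Link text>");
console.log(withPipe);
// → [url=www.url.com/a]b|Link text[/url]

// A ">" inside the URL: the match ends at the first ">",
// so the remainder is left behind as plain text.
const withAngle = toBBCode("<www.url.com/a>b|Link text>");
console.log(withAngle);
// → [url]www.url.com/a[/url]b|Link text>
```

So the pattern does not throw or skip the link in these cases; it converts it incorrectly, which is arguably harder to notice.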