How do I use regex to convert Slack URLs to BB Code?

Question

I'm trying to use regex to convert Slack's version of markdown formatting to BB Code. I'm stuck on links at the moment. Slack formats like this:

<www.url.com|This is the actual text>
<www.url.com>

BB Code formats like this:

[url=www.url.com]This is the actual text[/url]
[url]www.url.com[/url]

I'm handling the first type with this (in JavaScript):

string.replace(/\<([\s\S]+)(?=\|)\|([\s\S]*?)\>/gm, "[url=$1]$2[/url]")

I'm struggling to make a second rule that will only match text between <...> if there isn't a | in the string. Can anyone help me out?

Also if there's a neat way of dealing with both options in one go then let me know!

That's not Markdown. Please only use the markdown tag for questions about Markdown.
– Chris, Commented Mar 29, 2022 at 12:09
Apologies, Slack calls it their version of Markdown, but I agree, it is very different!
– David, Commented Mar 29, 2022 at 13:05

Wiktor Stribiżew · Accepted Answer · 2022-03-29 09:48:22Z

You can use

const text = `<www.url.com|This is the actual text>
<www.url.com>`;
console.log( text.replace(/<([^<>|]*)(?:\|([^<>]*))?>/g, (x, url, text) => text !== undefined ?
 `[url=${url}]${text}[/url]` : `[url]${url}[/url]`) )

See the regex demo. Details:

< - a < char (please NEVER escape this char in any regex flavor if you plan to match a < char)
([^<>|]*) - Group 1: any zero or more chars other than <, > and |
(?:\|([^<>]*))? - an optional non-capturing group matching one or zero occurrences of a | and then any zero or more chars other than < and > captured into Group 2
> - a > char (again, please never escape the > literal char in any regex flavor).
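For reuse, the same replacement can be wrapped in a small function. This is only a sketch: the name `slackLinksToBBCode` is made up for illustration, and the regex is the one explained above.

```javascript
// Hypothetical helper name; the pattern is the one from the answer above.
function slackLinksToBBCode(input) {
  return input.replace(/<([^<>|]*)(?:\|([^<>]*))?>/g, (match, url, label) =>
    // Group 2 (label) is undefined when the link has no "|text" part.
    label !== undefined ? `[url=${url}]${label}[/url]` : `[url]${url}[/url]`
  );
}

console.log(slackLinksToBBCode("See <www.url.com|This is the actual text> or <www.url.com>."));
// See [url=www.url.com]This is the actual text[/url] or [url]www.url.com[/url].
```

Comparing against `undefined` rather than testing truthiness means an empty label such as `<www.url.com|>` still goes down the `[url=...]` branch.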

7 Comments

anotherGatsby Over a year ago

Could you give more info on why <> should not be escaped?

Wiktor Stribiżew Over a year ago

@anotherGatsby Because they are not special regex metacharacters, and in some regex flavors \< and \> are word boundaries. A great number of users still think that "regex is universal" and will use the same pattern across languages and environments.
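The portability point can be seen within JavaScript itself, where the escaped form stops compiling once the `u` flag is added. A small sketch (the variable names are illustrative only):

```javascript
// Without the u flag, JavaScript treats \< and \> as plain "<" and ">".
const legacy = new RegExp("\\<www\\.url\\.com\\>");
console.log(legacy.test("<www.url.com>")); // true

// With the u flag, \< is not a valid escape, so the pattern fails to compile.
let unicodeError = null;
try {
  new RegExp("\\<www\\.url\\.com\\>", "u");
} catch (err) {
  unicodeError = err;
}
console.log(unicodeError instanceof SyntaxError); // true
```

Leaving `<` and `>` unescaped avoids the problem in both modes.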

anotherGatsby Over a year ago

I knew that they are not special characters but did not know that they were word boundaries in some flavor. Thanks for the info.

David Over a year ago

Thanks for this super fast response! I'm just getting my head around your notes to make sure I understand it.

David Over a year ago

@WiktorStribiżew Am I right in thinking that this will fail if the URL contains any of the excluded characters (<, > or |)? Hopefully this isn't common, but it could sometimes be a problem. Any ideas?
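A quick way to check this is to run the accepted pattern on inputs with a delimiter inside the URL. The sample URLs below are invented for illustration, and `convert` is just a throwaway name:

```javascript
const pattern = /<([^<>|]*)(?:\|([^<>]*))?>/g;
const convert = (s) =>
  s.replace(pattern, (match, url, label) =>
    label !== undefined ? `[url=${url}]${label}[/url]` : `[url]${url}[/url]`
  );

// A "|" inside the URL: the URL is cut at the first "|" and the rest becomes the label.
console.log(convert("<www.url.com/a|b|Link text>"));
// [url=www.url.com/a]b|Link text[/url]

// A "<" inside the URL: only the part after the inner "<" is matched.
console.log(convert("<www.url.com/a<b>"));
// <www.url.com/a[url]b[/url]
```

So the pattern does not throw in these cases, but it splits the link in the wrong place. Whether that matters depends on whether such characters can actually appear unencoded in the URLs you receive.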
