4

Let's say that I have a given string in JavaScript, e.g., var s = "{{1}}SomeText{{2}}SomeText"; and it may be very long (e.g., 25,000+ chars).

NOTE: I'm using "SomeText" here as a placeholder to refer to any number of characters of plain text. In other words, "SomeText" could be any plain text string which doesn't include {{1}} or {{2}}. So the above example could be var s = "{{1}}Hi there. This is a string with one { curly bracket{{2}}Oh, very nice to meet you. I also have one } curly bracket!"; And that would be perfectly valid.

The rules for it are simple:

It does not need to have any instances of {{2}}. However, if it does, then after that instance we cannot encounter another {{2}} unless we find a {{1}} first.

Valid examples:

"{{2}}SomeText"

"{{1}}SomeText{{2}}SomeText"

"{{1}}SomeText{{1}}SomeText{{2}}SomeText"

"{{1}}SomeText{{1}}SomeText{{2}}SomeText{{1}}SomeText"

"{{1}}SomeText{{1}}SomeText{{2}}SomeText{{1}}SomeText{{1}}SomeText"

"{{1}}SomeText{{1}}SomeText{{2}}SomeText{{1}}SomeText{{1}}SomeText{{2}}SomeText"

etc...

Invalid examples:

"{{2}}SomeText{{2}}SomeText"

"{{1}}SomeText{{2}}SomeText{{2}}SomeText"

"{{1}}SomeText{{2}}SomeText{{2}}SomeText{{1}}SomeText"

etc...

This seems like a relatively easy problem to solve - and indeed I could easily solve it without regular expressions, but I'm keen to learn how to do something like this with regular expressions. Unfortunately, I'm not even sure if "conditionals and lookaheads" is a correct description of the issue in this case.
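
For comparison, a non-regex check of the rule might look like the sketch below. The function name isValidScan is hypothetical, and the sketch assumes the only markers are the literal strings {{1}} and {{2}}.

```javascript
// Scan the markers left to right and remember whether the most recent
// marker seen was {{2}}. A second {{2}} with no {{1}} in between is invalid.
function isValidScan(s) {
  let pos = 0;
  let lastWasTwo = false;
  while (true) {
    const one = s.indexOf("{{1}}", pos);
    const two = s.indexOf("{{2}}", pos);
    if (two === -1) return true;      // no further {{2}}: nothing left to violate
    if (one !== -1 && one < two) {
      lastWasTwo = false;             // a {{1}} resets the state
      pos = one + 5;
    } else {
      if (lastWasTwo) return false;   // {{2}} ... {{2}} with no {{1}} between
      lastWasTwo = true;
      pos = two + 5;
    }
  }
}

console.log(isValidScan("{{1}}SomeText{{2}}SomeText"));           // true
console.log(isValidScan("{{1}}SomeText{{2}}SomeText{{2}}SomeText")); // false
```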

NOTE: If a workable solution is presented that doesn't involve "conditionals and lookaheads" then I will edit the title.

7
  • Do you need to handle anything like a {{3}}? If so regex is probably not a good option. Commented Jan 7, 2014 at 20:00
  • I don't think you need conditionals or lookaheads for this Commented Jan 7, 2014 at 20:00
  • @p.s.w.g Nope, I only have {{1}}, {{2}}, and plain text. Commented Jan 7, 2014 at 20:02
  • But... but... why....... Commented Jan 7, 2014 at 20:14
  • @a.real.human.being yeah, I know. Just that it's a severely complex use case, not really the place to start ;) Commented Jan 7, 2014 at 20:18

3 Answers

4

It's probably easier to invert the condition. Try to match any text that contains two instances of {{2}} with no {{1}} between them; if the string doesn't match that, it's good.

Using this strategy, your pattern can be as simple as:

/{\{2}}([^{]*){\{2}}/

This will match a literal {{2}}, followed by zero or more characters other than {, followed by a literal {{2}}.
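
As a rough sketch of the inverted check, the pattern can be wrapped in a helper that returns true when the "bad" pattern does not match. The name isValidSimple is hypothetical; the sample strings come from the question.

```javascript
// Pattern from this answer: {{2}}, then any run of characters other than "{",
// then another {{2}}.
const doubledTwoSimple = /{\{2}}([^{]*){\{2}}/;

// A string is considered valid when the doubled-{{2}} pattern is NOT found.
function isValidSimple(s) {
  return !doubledTwoSimple.test(s);
}

console.log(isValidSimple("{{1}}SomeText{{2}}SomeText"));                             // true
console.log(isValidSimple("{{2}}SomeText{{2}}SomeText"));                             // false
console.log(isValidSimple("{{1}}SomeText{{2}}SomeText{{1}}SomeText{{2}}SomeText"));   // true
```

Because [^{] excludes every { character, this version does not detect a doubled {{2}} when the text between the two markers itself contains a {.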

Notice that the second { needs to be escaped; otherwise, the regex engine will treat the {2} as a quantifier on the previous { (i.e. {{2} matches exactly two { characters).
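
The effect of the missing escape can be seen in a small sketch. It assumes a regex without the u flag; with the u flag, unescaped braces like these are rejected as a syntax error. The variable names are illustrative only.

```javascript
// Unescaped: the inner {2} is read as a quantifier on the first "{",
// so this pattern means "{" twice, followed by "}".
const unescaped = /{{2}}/;
console.log(unescaped.test("{{2}}")); // false
console.log(unescaped.test("{{}"));   // true

// Escaping every brace avoids relying on how a lone brace is parsed.
const escaped = /\{\{2\}\}/;
console.log(escaped.test("{{2}}"));   // true
```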


Just in case you need to allow characters like { between the two {{2}}, you can use a pattern like this:

/{\{2}}((?!{\{1}}).)*{\{2}}/

This will match a literal {{2}}, followed by zero or more of any character, so long as those characters do not begin a sequence like {{1}}, followed by a literal {{2}}.
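
A minimal sketch of this second pattern in use follows. The helper names are hypothetical. Since . does not match line breaks by default, the sketch also includes a variant that uses [\s\S] instead, on the assumption that the text may span several lines.

```javascript
// Pattern from this answer: {{2}}, then any characters that do not start
// a {{1}}, then another {{2}}.
const doubledTwoTempered = /{\{2}}((?!{\{1}}).)*{\{2}}/;

// Variant with every brace escaped and [\s\S] in place of "." so that
// line breaks between the markers are also consumed.
const doubledTwoMultiline = /\{\{2\}\}(?:(?!\{\{1\}\})[\s\S])*\{\{2\}\}/;

function isValidTempered(s) {
  return !doubledTwoTempered.test(s);
}

function isValidMultiline(s) {
  return !doubledTwoMultiline.test(s);
}

console.log(isValidTempered("{{2}}one { brace{{2}}more"));        // false
console.log(isValidTempered("{{2}}one { brace{{1}}x{{2}}more"));  // true
console.log(isValidMultiline("{{2}}line1\nline2{{2}}"));          // false
```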

4 Comments

Geez, I figured the solution was going to be short, but not THAT short! Your English explanation makes perfect sense, but I do not understand how the regular expression works... I'll have to study it!
How would you classify this solution? I'd like to update the title of the question so that others might find it if they are facing a similar problem.
I like this strategy, but the regex still needs work. If the text in-between tags contains a "{" (which is valid), it throws it off.
@a.real.human.being Hmmm, I'd probably call this something like 'finding strings that do not contain an unbroken repetition of a pattern' or 'prohibiting strings with unbroken repetitions of a pattern'.
0
(({{1}}SomeText)+({{2}}SomeText)?)*

Broken down:

({{1}}SomeText)+ - 1 to many {{1}} instances (greedy match)

({{2}}SomeText)? - followed by an optional {{2}} instance

Then the whole thing is wrapped in ()* such that the sequence can appear 0 to many times in a row.

No conditionals or lookaheads needed.

3 Comments

That SomeText after the {{1}} could contain a {{2}}, which would be an invalid match.
Apologies if it wasn't clear in the original post, but "SomeText" was just a placeholder for a plain text string (original post updated).
Ahh, I missed that {{1}} was optional, I'll revise.
0

You said you can have one instance of {2} first, right?

^(.(?!{2}))(.{2})?(?!{2})((.(?!{2})){1}(.(?!{2}))({2})?)$

Note if {2} is one letter replace all dots with [^{2}]
