I am working on an HTML editor in JavaScript, trying to implement an undo-feature.

So I have this HTML code (with hidden comments for storing app states):

<!-- RECONSTRUCT: 'test1' -->
<h1>FOO</h1>
<!-- END RECONSTRUCT -->
<h1>BAR</h1>
<!-- RECONSTRUCT: 'test2' -->
<h1>FOOFOO</h1>
<!-- END RECONSTRUCT -->

which I need to transform into this HTML code:

test1
<h1>BAR</h1>
test2

So basically, the HTML comments "save" an old state to which I need to restore the code.

What I want a regex to produce is:

[0:"test1", 1:"<h1>FOO</h1>", 2:"test2", 3:"<h1>FOOFOO</h1>"]

or something similar.

The problem is, when I try to use Regex like this:

src.match(/<!-- RECONSTRUCT: '(.*)' -->(.*)<!-- END RECONSTRUCT -->/g)

I get

[0: "<!-- RECONSTRUCT: 'test1' --> ... FOO ... BAR <!-- RECONSTRUCT ... FOOFOO ... ->"]

so the complete input is returned as a single result, since it is a valid match. I also can't get it working with a negative lookahead:

<!-- RECONSTRUCT: '(.*)' -->((?!RECONSTRUCT:).)*

1 Answer

In JavaScript, . does not match newline characters by default. Since ES2018 the s (dotAll) flag overrides this behaviour, but older engines have no such modifier. A workaround that works in any JavaScript engine is to use [^] (or [\s\S]) instead of . wherever newlines should also match.
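As a small illustration of the newline behaviour described above, here is a minimal sketch. The two-line string and the variable names are made up for the example, and the last line assumes an engine with ES2018 support:

```javascript
// Hypothetical two-line input: "." stops at the newline, "[^]" does not.
const text = "a\nb";

const dotMatch = /a.b/.test(text);        // "." does not match "\n"
const anyMatch = /a[^]b/.test(text);      // "[^]" matches any character
const classMatch = /a[\s\S]b/.test(text); // portable equivalent of "[^]"
const dotAllMatch = /a.b/s.test(text);    // "s" flag lets "." match "\n" (ES2018+)

console.log(dotMatch, anyMatch, classMatch, dotAllMatch);
// false true true true
```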

Also make the quantifiers lazy by appending a ? to them:

var src = `<!-- RECONSTRUCT: 'test1' -->
<h1>FOO</h1>
<!-- END RECONSTRUCT -->
<h1>BAR</h1>
<!-- RECONSTRUCT: 'test2' -->
<h1>FOOFOO</h1>
<!-- END RECONSTRUCT -->`;

src = src.replace(
    /<!--\s*RECONSTRUCT:\s*'(.*?)'\s*-->[^]*?<!--\s*END RECONSTRUCT\s*-->/g, '$1');

console.log(src);
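If the goal is the flat list from the question (state name followed by the enclosed markup) rather than the replaced string, something along these lines might work. This is only a sketch: it adds a second capture group around the block content, assumes String.prototype.matchAll (ES2020) is available, and the names input, pattern and pairs are made up for the example.

```javascript
// Same sample input as above, repeated so this snippet runs on its own.
const input = `<!-- RECONSTRUCT: 'test1' -->
<h1>FOO</h1>
<!-- END RECONSTRUCT -->
<h1>BAR</h1>
<!-- RECONSTRUCT: 'test2' -->
<h1>FOOFOO</h1>
<!-- END RECONSTRUCT -->`;

// Group 1: the saved state; group 2: the markup between the two comments.
const pattern =
    /<!--\s*RECONSTRUCT:\s*'(.*?)'\s*-->([^]*?)<!--\s*END RECONSTRUCT\s*-->/g;

// matchAll yields one match object per block, including its capture groups.
const pairs = [];
for (const m of input.matchAll(pattern)) {
    pairs.push(m[1], m[2].trim()); // trim() drops the surrounding newlines
}

console.log(pairs);
// ["test1", "<h1>FOO</h1>", "test2", "<h1>FOOFOO</h1>"]
```

Unlike match() with the g flag, which returns only the full matched strings, matchAll() keeps the capture groups for every match.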

5 Comments

Wow! Thanks for the quick and amazingly light answer! Really appreciate it, and wish I could tip you a drink for it.
Quick question: how does your regex solve the "nested occurrences" problem? Why is the input not interpreted as a single (and only) result, like in my attempts?
When I tried your match, I got null as the result (which is expected, because .* does not cross newlines), so I cannot really compare. Also, I don't see any truly nested occurrences in your sample. What do you mean by nested?
Like this: the whole input is a valid match, since it starts with <!-- RECONSTRUCT ... and ends with "<!-- END RECONSTRUCT -->" (with another RECONSTRUCT block in between). That was my main problem with this task when using match(). Maybe it's just replace() that "solves" it?
But in the sample they are not truly nested, as the first END RECONSTRUCT belongs to the first RECONSTRUCT, and the second to the second. One stops before the other begins. As I said, I could not reproduce what you get with match.
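One possible explanation for the "whole input as one match" result discussed above is greedy quantifiers on input without line breaks. The sketch below uses a hypothetical single-line string (not the asker's actual source) to compare greedy .* with lazy .*? on two consecutive blocks:

```javascript
// Hypothetical single-line input, so that "." can run across the whole string.
const oneLine =
    "<!-- RECONSTRUCT: 'a' -->X<!-- END RECONSTRUCT -->Y" +
    "<!-- RECONSTRUCT: 'b' -->Z<!-- END RECONSTRUCT -->";

// Greedy: ".*" grabs as much as possible, so the match runs to the LAST end marker.
const greedy = /<!-- RECONSTRUCT: '(.*)' -->(.*)<!-- END RECONSTRUCT -->/g;
// Lazy: ".*?" grabs as little as possible, so each match stops at the FIRST end marker.
const lazy = /<!-- RECONSTRUCT: '(.*?)' -->(.*?)<!-- END RECONSTRUCT -->/g;

const greedyMatches = oneLine.match(greedy);
const lazyMatches = oneLine.match(lazy);

console.log(greedyMatches.length); // 1 (the entire string)
console.log(lazyMatches.length);   // 2 (one per block)
```

So it is the lazy quantifiers, not the switch from match() to replace(), that keep the two blocks apart.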
