I need to parse HTML files and extract any characters found within the following flag:

${message}

The message may contain words, whitespace, and even special characters. I have the following regex that seems to partially work:

/\$\{(.+)\}/g

This pattern appears to work backwards from the line break and stop at the first } it finds. The desired behavior is to work forward from the ${ and stop at the first }.

Here is the regex in RegExr: https://regexr.com/3ng3d

I have the following test case:

<div>
  <div class="panel-heading">
    <h2 class="panel-title">${Current Status}<span> - {{data.serviceDisplay}}</span></h2>
  </div>
  ${test}
  <div class="panel-body">
    <div>${We constantly monitor our services and their related components.} ${If there is ever a service interruption, a notification will be posted to this page.} ${If you are experiencing problems not listed on this page, you can submit a request for service.}</div>
    <div>
      <div>${No system is reporting an issue}</div>
    </div>
    <div>
      <a>{{outage.typeDisplay}} - {{outage.ci}} (${started {{outage.begin}}})
        <div></div>
      </a>
    </div>
    <div><a href="?id=services_status" aria-label="${More information, open current status page}">${More information...}
     </a></div>
  </div>
</div>

The regex should extract the following:

  1. Current Status
  2. test
  3. We constantly monitor our services and their related components.
  4. If there is ever a service interruption, a notification will be posted to this page.
  5. If you are experiencing problems not listed on this page, you can submit a request for service.
  6. No system is reporting an issue
  7. started {{outage.begin}}
  8. More information, open current status page
  9. More information...

But what I'm actually getting is...

  1. ${Current Status} - {{data.serviceDisplay}}
  2. ${test}
  3. ${We constantly monitor our services and their related components.} ${If there is ever a service interruption, a notification will be posted to this page.} ${If you are experiencing problems not listed on this page, you can submit a request for service.}
  4. ${No system is reporting an issue}
  5. ${started {{outage.begin}}}
  6. ${More information, open current status page}">${More information...}

It appears my regex is working back from the \n and finding the first } which is what's giving me #1, #3, and #6.

How can I work from the start and find the first } as opposed to working backwards from the line break?

    /\$\{(.+?)\}/g should give you closer to what you want (you need lazy matching instead of greedy), but it still has issues with #7. Commented Apr 6, 2018 at 17:58
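
To illustrate the greedy versus lazy difference described in the comment above, here is a minimal sketch. The `sample` string and the `captures` helper are made up for illustration and are not part of the original question.

```javascript
// Hypothetical one-line sample, shortened from the test case in the question.
var sample = '<div>${one} ${two}</div> (${started {{outage.begin}}})';

// Hypothetical helper: collect capture group 1 of every match.
function captures(pattern, text) {
  var out = [];
  var m;
  while ((m = pattern.exec(text)) !== null) {
    out.push(m[1]);
  }
  return out;
}

// Greedy: .+ runs to the end of the line, then backtracks to the LAST }.
var greedy = captures(/\$\{(.+)\}/g, sample);
// Lazy: .+? stops at the FIRST }, which cuts the nested {{...}} short.
var lazy = captures(/\$\{(.+?)\}/g, sample);

console.log(greedy); // [ 'one} ${two}</div> (${started {{outage.begin}}' ]
console.log(lazy);   // [ 'one', 'two', 'started {{outage.begin' ]
```

The lazy version fixes the run-together matches but truncates the flag that contains nested braces, which is the issue with #7 mentioned in the comment.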

1 Answer

Use RegExp.exec() to iterate the text and extract the capture group.

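The pattern is /\$\{(.+?)\}(?=[^}]+?(?:{|$))/g. It lazily matches characters up to a closing curly bracket, and the lookahead requires that bracket to be followed by one or more characters other than } that end at an opening curly bracket or at the end of the string. A } that is immediately followed by another } therefore cannot close the match, which keeps nested {{...}} inside the capture.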

var pattern = /\$\{(.+?)\}(?=[^}]+?(?:{|$))/g;
var text = '<div>\
  <div class="panel-heading">\
    <h1>${Text {{variable}} more text}</h1>\
    <h2 class="panel-title">${Current Status}<span> - {{data.serviceDisplay}}</span></h2>\
  </div>\
  ${test}\
  <div class="panel-body">\
    <div>${We constantly monitor our services and their related components.} ${If there is ever a service interruption, a notification will be posted to this page.} ${If you are experiencing problems not listed on this page, you can submit a request for service.}</div>\
    <div>\
      <div>${No system is reporting an issue}</div>\
    </div>\
    <div>\
      <a>{{outage.typeDisplay}} - {{outage.ci}} (${started {{outage.begin}}})\
        <div></div>\
      </a>\
    </div>\
    <div><a href="?id=services_status" aria-label="${More information, open current status page}">${More information...}\
     </a></div>\
  </div>\
</div>';

var result = [];
var temp;
while(temp = pattern.exec(text)) {
  result.push(temp[1]);
}

console.log(result);
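
As an alternative to the lookahead, the same extraction can be sketched without a regex by counting brace depth. This is a swapped-in technique, not the method used in the answer above, and `extractFlags` is a hypothetical helper name. One reason to consider it: the lookahead pattern needs at least one character other than } after the closing bracket, so a flag sitting at the very end of the input (for example the whole string '${a}') is not matched by it.

```javascript
// Hypothetical helper: collect the contents of every ${...} flag,
// tracking brace depth so nested {{...}} stays inside the capture.
function extractFlags(text) {
  var results = [];
  var i = 0;
  while (i < text.length) {
    var start = text.indexOf('${', i);
    if (start === -1) break;      // no more flags
    var depth = 1;                // the "{" of "${" is already open
    var j = start + 2;
    while (j < text.length && depth > 0) {
      if (text[j] === '{') depth++;
      else if (text[j] === '}') depth--;
      j++;
    }
    if (depth !== 0) break;       // unbalanced braces: stop rather than guess
    // j now points just past the closing "}", so exclude it from the slice.
    results.push(text.slice(start + 2, j - 1));
    i = j;
  }
  return results;
}

var flags = extractFlags('<a>{{outage.ci}} (${started {{outage.begin}}})</a> ${test}');
console.log(flags); // [ 'started {{outage.begin}}', 'test' ]
```

This assumes braces inside a flag are always balanced, which holds for the test case in the question but may not hold for arbitrary HTML.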

1 Comment

See updated answer. Added an example for ${Text {{variable}} more text} (the h1).
