
How to find pairs of opening and closing HTML tags in JavaScript?

I have an array of parsed HTML tags (markup only; any inner text is omitted for simplicity):


const parsedHtml = [
    '<div class="container">',
    '<div class="wrapper">',
    '<h3>',
    '</h3>',
    '<p>',
    '</p>',
    '<span>',
    '<a href="#">',
    '<img src="./img.svg">',
    '</span>',
    '</div>',
    '</div>'
];

The whole array represents one block of HTML, nested in the order shown above.

The goal is to find opening and closing tag pairs (just their indices), so that I can separate out blocks of code like these:

<div class="container">
...
</div>


// or

<h3>
</h3>

//or 

<span>
...
</span>


I just need a way to find the index of the closing tag that matches a given opening tag (think of it like folding blocks of code in VS Code).

I could check whether parsedHtml[i].startsWith('</'), but that alone does not tell me which opening tag a closing tag belongs to, i.e. it does not give me a pair like this:

<div> ---> opening

</div> --->  closing

[pair]

NOTE

This is for working out how the tags are nested, so that I can indent the HTML accordingly and show each element as a block. I don't want to use packages like parse5, marked, Prism.js, or highlight.js.

My requirement is custom: I only need the opening and closing tag pairs, so that I can work out the nesting from the parsed HTML array above.

  • Use an IDE like Visual Studio Code. Commented Oct 24, 2020 at 5:11
  • No. This is for an HTML webpage, not something to be done in VS Code (there are extensions for that). It is for parsing and displaying HTML in a specific manner inside a webpage. Commented Oct 24, 2020 at 5:12

2 Answers

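Here is my approach. It reduces the tag names into an object that maps each tag name to a list of [openIndex, closeIndex] pairs; a tag that is never closed keeps only its opening index: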

var parsedHtml = [
   '<div class="container">',
   '<div class="wrapper">',
   '<h3>',
   '</h3>',
   '<p>',
   '</p>',
   '<span>',
   '<a href="#">',
   '<img src="./img.svg">',
   '</span>',
   '</div>',
   '</div>'
];
// Strip the angle brackets and keep the first word: 'div', '/div', 'img', ...
var getTag = (s) => s.replace(/<|>/g, '').split(' ')[0];
var isCloseTag = (t) => t.includes('/');

var indices = parsedHtml.map(getTag).reduce(collectIndices, {});
console.log(JSON.stringify(indices)); // {"div":[[0,11],[1,10]],"h3":[[2,3]],"p":[[4,5]],"span":[[6,9]],"a":[[7]],"img":[[8]]}

function collectIndices(indices, tag, i) {
   const tagName = tag.replace('/', '');
   if (isCloseTag(tag)) {
      // Pair with the most recent still-open tag of the same name.
      // slice() copies the list first, so reverse() does not reorder the stored pairs.
      const open = (indices[tagName] || []).slice().reverse().find((ins) => ins.length === 1);
      if (open) open.push(i); // a stray closing tag with no opener is ignored
      return indices;
   }
   if (!(tagName in indices)) {
      indices[tagName] = [];
   }
   indices[tagName].push([i]);
   return indices;
}
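
For comparison, the same pairing can be sketched with an explicit stack, which keeps the pairs in document order and also copes with tags that are never closed. This is only an illustration under assumptions: findTagPairs, VOID_TAGS and sampleTags are hypothetical names that do not appear in the code above, and the sketch assumes every array entry is a single tag and that void elements such as <img> never have a closing tag.

```javascript
// Hypothetical helper: pairs opening and closing tags using a stack.
// Assumption: void elements never get a closing tag, so they are skipped.
const VOID_TAGS = new Set(['area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr']);

function findTagPairs(tags) {
  const stack = []; // open tags still waiting for a close: { name, index }
  const pairs = []; // collected [openIndex, closeIndex] pairs

  tags.forEach((tag, index) => {
    const match = /^<\s*(\/?)\s*([a-zA-Z][\w-]*)/.exec(tag);
    if (!match) return; // entry is not a tag
    const isClose = match[1] === '/';
    const name = match[2].toLowerCase();

    if (!isClose) {
      if (!VOID_TAGS.has(name)) stack.push({ name, index });
      return;
    }

    // Walk down the stack to the nearest open tag with the same name.
    let pos = stack.length - 1;
    while (pos >= 0 && stack[pos].name !== name) pos--;
    if (pos < 0) return; // stray closing tag, no opener to pair with

    pairs.push([stack[pos].index, index]);
    // Tags above pos were never closed (like the <a> below); drop them.
    stack.length = pos;
  });

  return pairs.sort((a, b) => a[0] - b[0]); // order by opening index
}

const sampleTags = [
  '<div class="container">', '<div class="wrapper">', '<h3>', '</h3>',
  '<p>', '</p>', '<span>', '<a href="#">', '<img src="./img.svg">',
  '</span>', '</div>', '</div>'
];
console.log(JSON.stringify(findTagPairs(sampleTags)));
// [[0,11],[1,10],[2,3],[4,5],[6,9]]
```

With this shape, the number of entries on the stack at the moment a tag is opened is its nesting depth, which could drive the indentation.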

I found this approach, which uses a JavaScript regex, here: https://www.octoparse.com/blog/using-regular-expression-to-match-html

All you have to do is put in the tag you are looking for.

For example, to look for the a tag: /<a\s*.*>\s*.*<\/a>/gi

You can test it out with this regex tool: https://regexr.com/
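
As a rough illustration of how that pattern behaves, here is a minimal sketch. It assumes the markup is a single string rather than the question's array of tags; the html and anchorPattern names and the sample markup are hypothetical.

```javascript
// Hypothetical input: one HTML string, not the array from the question.
const html = '<p>intro</p><a href="#">link</a>';

// The pattern from the answer above, with "a" as the tag being searched for.
const anchorPattern = /<a\s*.*>\s*.*<\/a>/gi;

const matches = html.match(anchorPattern);
console.log(matches); // [ '<a href="#">link</a>' ]
```

Two things to keep in mind: `.*` is greedy, so two a elements on the same line are returned as one match running from the first `<a` to the last `</a>`, and `.` does not match line breaks, so an element that spans several lines is not matched at all.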
