I have a string:

  "some text 0 <span>span 0 </span>some text 1<span>span 1</span>"

I would like to transform it into some kind of structure like:

[
    { text: 'some text 0' },
    { span: 'span 0' },
    { text: 'some text 1' },
    { span: 'span 1' }
]

I know I can wrap it in a jQuery object and use find to get an array of the spans, but is there a way to get an array like the one above?

Thanks!

  • What have you tried so far? Commented Nov 29, 2018 at 5:54
  • Are there ever any elements nested inside the <span>s? Commented Nov 29, 2018 at 5:56
  • No, the spans are all at that level, and there is nothing nested in them but text. Commented Nov 29, 2018 at 5:58
  • Check this answer. stackoverflow.com/questions/30455489/… Commented Nov 29, 2018 at 5:59
  • @MustafaTığ Using regular expressions to parse HTML often makes code more complicated than it has to be. In this case, JavaScript already has built-in HTML parsers, so it is better to use them than to resort to a regular expression. Commented Nov 29, 2018 at 6:00

2 Answers

Because jQuery doesn't have very convenient methods for dealing with text nodes, I would prefer to use built-in JavaScript: parse the string with DOMParser, iterate over the body's childNodes, and .map each node to an object whose key is 'text' for a text node or the lowercased tagName for an element, and whose value is the node's textContent:

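const str = "some text 0 <span>span 0 </span>some text 1<span>span 1</span>";
// Parse the fragment; the parser wraps it in <html><body>…</body></html>.
const doc = new DOMParser().parseFromString(str, 'text/html');
// childNodes includes text nodes, unlike children, which holds elements only.
const arr = [...doc.body.childNodes]
  .map((node) => ({
    // nodeType 3 is a text node; anything else here is an element such as <span>.
    [node.nodeType === 3 ? 'text' : node.tagName.toLowerCase()]: node.textContent
  }));
console.log(arr);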
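
The snippet above keeps surrounding whitespace (for example 'some text 0 ' with a trailing space), while the structure in the question shows trimmed values. A minimal sketch of a variant that trims each value, drops whitespace-only entries, and ignores node types other than elements and text might look like this. The helper name nodesToSegments is hypothetical, not something from the answer, and trimming is an assumption about what the asker wants.

```javascript
// Hypothetical helper (the name is illustrative, not part of the original answer).
// Accepts any iterable or array-like of DOM-like nodes, e.g. doc.body.childNodes.
function nodesToSegments(nodes) {
  const ELEMENT_NODE = 1;
  const TEXT_NODE = 3;
  return Array.from(nodes)
    // Keep only elements and text nodes; comment nodes, for instance, have no tagName.
    .filter((node) => node.nodeType === ELEMENT_NODE || node.nodeType === TEXT_NODE)
    .map((node) => ({
      key: node.nodeType === TEXT_NODE ? 'text' : node.tagName.toLowerCase(),
      value: node.textContent.trim(),
    }))
    // Drop entries that were empty or whitespace-only.
    .filter(({ value }) => value !== '')
    .map(({ key, value }) => ({ [key]: value }));
}

// Browser usage. The guard is there because DOMParser is a browser API
// and may not be defined in other environments.
if (typeof DOMParser !== 'undefined') {
  const str = 'some text 0 <span>span 0 </span>some text 1<span>span 1</span>';
  const doc = new DOMParser().parseFromString(str, 'text/html');
  console.log(nodesToSegments(doc.body.childNodes));
}
```

Keeping the mapping in a function that only reads nodeType, tagName and textContent means it can be exercised with plain objects as well as real DOM nodes.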

1 Comment

See MDN: a nodeType of 3 indicates a text node (Node.TEXT_NODE).

Using a regular expression, you can try the following.

// Each match is a (possibly empty) run of plain text followed by one <span>…</span>.
// Limitations: only letters, digits and spaces are matched, and any text
// after the last span is ignored.
const regex = /([a-zA-Z0-9 ]*)<span>([a-zA-Z0-9 ]*)<\/span>/g;
const str = `some text 0 <span>span 0 </span>some text 1<span>span 1</span>some<span>span 1</span>`;
const ar = [];
// Group 1 is the text before the span, group 2 is the span's content.
for (const [, text, span] of str.matchAll(regex)) {
    // Skip the text entry when one span directly follows another.
    if (text !== '') {
        ar.push({ text });
    }
    ar.push({ span });
}
console.log(ar);
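
If a regular expression is used anyway, one alternative is to split on the span tags rather than match text-plus-span pairs, which also keeps text that follows the last span. This is only a sketch under the constraints stated in the comments (flat, attribute-free `<span>` tags with nothing nested inside); the function name splitSpans is hypothetical, and trimming the values is an assumption based on the output shown in the question.

```javascript
// Hypothetical helper (illustrative name). Assumes flat <span>…</span> tags
// with no attributes and no nested elements, as stated in the comments.
function splitSpans(html) {
  // split() with a capture group returns the captured span contents between
  // the surrounding text pieces: even indexes are text, odd indexes are spans.
  const parts = html.split(/<span>(.*?)<\/span>/);
  const result = [];
  parts.forEach((part, index) => {
    const value = part.trim();
    if (value === '') {
      return; // skip empty pieces, e.g. between two adjacent spans
    }
    result.push(index % 2 === 0 ? { text: value } : { span: value });
  });
  return result;
}

console.log(splitSpans('some text 0 <span>span 0 </span>some text 1<span>span 1</span>'));
```

Note that `.` does not match line breaks, so a span whose content wraps across lines would not be recognised by this pattern; for anything beyond this simple shape an HTML parser is the safer choice.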
