2

I am trying to convert

  myHtml = `
  <span style="font-family: &quot;Open Sans&quot;, Arial, sans-serif; font-size: 14px; text-align: justify;">
  
  In nec <i>convallis</i> justo. Quisque egestas mollis nibh non hendrerit. <strong>Phasellus</strong> tempus sapien in ultricies aliquet. Maecenas nec risus viverra tortor rhoncus venenatis in sit amet enim. Integer id ipsum non leo finibus sagittis in eu velit. Curabitur sed dolor dui. <span>Mauris <strong>aliquam <i>magna</i></strong> a ipsum</span> tincidunt tempor vitae sit amet ante. Maecenas pellentesque augue vitae quam faucibus, vel convallis dolor placerat. Pellentesque semper justo a turpis euismod, ac gravida enim suscipit.</span>
  `;

into

  data = [
    {
      openTag:
        '<span style="font-family: &quot;Open Sans&quot;, Arial, sans-serif; font-size: 14px; text-align: justify;">',
      closingTag: "</span>",
      children: [
        { value: "In nec" },
        { openTag: "<i>", value: "convallis", closingTag: "</i>" },
        { value: " justo. Quisque egestas mollis nibh non hendrerit. " },
        { openTag: "<strong>", value: "Phasellus", closingTag: "</strong>" },
        {
          value:
            " tempus sapien in ultricies aliquet. Maecenas nec risus viverra tortor rhoncus venenatis in sit amet enim. Integer id ipsum non leo finibus sagittis in eu velit. Curabitur sed dolor dui. "
        },
        {
          opentag: "<span>",
          children: [
            { value: "Mauris " },
            {
              opentag: "<strong>",
              childeren: [
                { value: "aliquam" },
                { opentag: "<i>", value: "magna", closingTag: "</i>" }
              ],
              closingTag: "</strong>"
            },
            { value: " a ipsum" }
          ],
          closingTag: "</span>"
        },
        {
          value:
            " tincidunt tempor vitae sit amet ante. Maecenas pellentesque augue vitae quam faucibus, vel convallis dolor placerat. Pellentesque semper justo a turpis euismod, ac gravida enim suscipit."
        }
      ]
    }
  ];

Current Output

{
  "rawTagName": null,
  "children": [
    {
      "children": []
    },
    {
      "rawTagName": "span",
      "children": [
        {
          "value": "\n  \n  In nec "
        },
        {
          "rawTagName": "i",
          "value": "convallis"
        },
        {
          "value": " justo. Quisque egestas mollis nibh non hendrerit. "
        },
        {
          "rawTagName": "strong",
          "value": "Phasellus"
        },
        {
          "value": " tempus sapien in ultricies aliquet. Maecenas nec risus viverra tortor rhoncus venenatis in sit amet enim. Integer id ipsum non leo finibus sagittis in eu velit. Curabitur sed dolor dui. "
        },
        {
          "rawTagName": "span",
          "children": [
            {
              "children": []
            },
            {
              "rawTagName": "strong",
              "children": [
                {
                  "value": "aliquam "
                },
                {
                  "rawTagName": "i",
                  "value": "magna"
                }
              ]
            },
            {
              "children": []
            }
          ]
        },
        {
          "value": " tincidunt tempor vitae sit amet ante. Maecenas pellentesque augue vitae quam faucibus, vel convallis dolor placerat. Pellentesque semper justo a turpis euismod, ac gravida enim suscipit."
        }
      ]
    },
    {
      "children": []
    }
  ]
}

Below is my current approach using recursion

import { parse } from "node-html-parser";

// Write Javascript code!

const myHtml = `
  <span style="font-family: &quot;Open Sans&quot;, Arial, sans-serif; font-size: 14px; text-align: justify;">
  
  In nec <i>convallis</i> justo. Quisque egestas mollis nibh non hendrerit. <strong>Phasellus</strong> tempus sapien in ultricies aliquet. Maecenas nec risus viverra tortor rhoncus venenatis in sit amet enim. Integer id ipsum non leo finibus sagittis in eu velit. Curabitur sed dolor dui. <span>Mauris <strong>aliquam <i>magna</i></strong> a ipsum</span> tincidunt tempor vitae sit amet ante. Maecenas pellentesque augue vitae quam faucibus, vel convallis dolor placerat. Pellentesque semper justo a turpis euismod, ac gravida enim suscipit.</span>
  `;

const tranform = str => {
  const nodeAsObject = root => {
    if (root.childNodes.length === 0) {
      return { value: root.rawText };
    }
    if (
      root.childNodes.length === 1 &&
      root.childNodes[0].childNodes.length === 0
    ) {
      return {
        rawTagName: root.rawTagName,
        value: root.rawText
      };
    }
    return {
      rawTagName: root.rawTagName,
      children: root.childNodes.map(x => {
        return {
          rawTagName: x.rawTagName,
          children: x.childNodes.map(y => {
            return nodeAsObject(y);
          })
        };
      })
    };
  };
  return nodeAsObject(parse(str));
};

console.log(tranform(myHtml));

const pre = document.getElementById("pre");
pre.innerHTML = JSON.stringify(tranform(myHtml), null, "  ");

Stackblitz Demo Here

1 Answer 1

1

I went for the recursive approach and created an output that is similar to your expected output.

const myHtml = `
  <span style="font-family: &quot;Open Sans&quot;, Arial, sans-serif; font-size: 14px; text-align: justify;">

  In nec <i>convallis</i> justo. Quisque egestas mollis nibh non hendrerit. <strong>Phasellus</strong> tempus sapien in ultricies aliquet. Maecenas nec risus viverra tortor rhoncus venenatis in sit amet enim. Integer id ipsum non leo finibus sagittis in eu velit. Curabitur sed dolor dui. <span>Mauris <strong>aliquam <i>magna</i></strong> a ipsum</span> tincidunt tempor vitae sit amet ante. Maecenas pellentesque augue vitae quam faucibus, vel convallis dolor placerat. Pellentesque semper justo a turpis euismod, ac gravida enim suscipit.</span>
  `;

let revisedHtml;

const parser = htmlStr => {
  let htmlElements = parse(htmlStr);
  revisedHtml = htmlElements.childNodes.map(node => {
    return createTranslatedNode(node);
  });
  return htmlElements;
};

const createTranslatedNode = node => {
  let currentNode = {};
  // This is a textNode.
  if (!node.rawTagName) {
    currentNode = { value: node?.rawText?.trim() };
  }
  // This is a tagNode
  if (node.rawTagName) {
    currentNode = {
      openTag: `<${node.rawTagName} ${node.rawAttrs}>`,
      closingTag: `</${node.rawTagName}>`
    };
  }

  if (node?.childNodes?.length === 1 && !node?.childNodes[0].rawTagName) {
    currentNode.value = node?.childNodes[0].rawText?.trim();
  }

  if (node?.childNodes?.length > 1) {
    currentNode.children = node.childNodes.map(childNode => {
      return createTranslatedNode(childNode);
    });
  }

  return currentNode;
};

console.log(revisedHtml);

I do some assumptions with the open and close tags since I just do some string concatination to add the <> around the tags. Other than that I trim() the value inputs to remove the unwanted whitespace around the value.

This does make some assumptions about the html, like it always having start and closing tags, and such. A further improvement that could be done would be to test for that also.

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks, let me try this out

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.