207

I am running into issues when trying to use DOMParser in my JavaScript code. I retrieve an XML document from a SOAP response via xmlhttp.responseText, and I want to access its elements in JSON format, so my code looks like:

var xml = new DOMParser();
xml = xml.parseFromString(xmlhttp.responseText, 'text/xml');
var result = xmlToJson(xml);

I get this error message: ReferenceError: DOMParser is not defined

Edit: The question "JavaScript DOMParser access innerHTML and other properties" hasn't worked for me because my JavaScript isn't in an HTML page; it is a Node.js file.

11 Answers

273
+350

A lot of browser functionality, like DOM manipulation or XHR, is not available natively in Node.js because accessing the DOM is not a typical server task. You'll have to use an external library to do that.

DOM capabilities depend a lot on the library. Here's a quick comparison of the main tools you can use:

  • jsdom: implements DOM level 4, which is the latest DOM standard, so most of what you can do with the DOM in a modern browser, you can also do in jsdom. It is the de facto industry standard for doing browser stuff on Node, used by Mocha, Vue Test Utils, Webpack Prerender SPA Plugin, and many others:

    const jsdom = require("jsdom");
    const dom = new jsdom.JSDOM(`<!DOCTYPE html><p>Hello world</p>`);
    dom.window.document.querySelector("p").textContent; // 'Hello world'
    
  • deno_dom: if using Deno instead of Node is an option, this library provides DOM parsing capabilities:

    import { DOMParser } from "https://deno.land/x/deno_dom/deno-dom-wasm.ts";
    const parser = new DOMParser();
    const document = parser.parseFromString('<p>Hello world</p>', 'text/html');
    document.querySelector('p').textContent; // 'Hello world';
    
  • htmlparser2: a callback-based HTML/XML parser rather than a full DOM implementation like jsdom. It offers better performance and flexibility at the price of a more complex API:

    const htmlparser = require("htmlparser2");
    const parser = new htmlparser.Parser({
      onopentag: (name, attrib) => {
        if (name === 'p') console.log('a paragraph element is opening');
      }
    }, {decodeEntities: true});
    parser.write(`<!DOCTYPE html><p>Hello world</p>`);
    parser.end();
    // console output: 'a paragraph element is opening'
    
  • cheerio: an implementation of the core jQuery API for the server, based on HTML DOM parsing by htmlparser2:

    const cheerio = require('cheerio');
    const $ = cheerio.load(`<!DOCTYPE html><p>Hello world</p>`);
    $('p').text('Bye moon');
    $.html(); // serialized document, now containing '<p>Bye moon</p>'
    
  • xmldom: fully implements DOM level 2 and partially implements DOM level 3. It works with both HTML and XML.

  • dom-parser: a regex-based DOM parser that implements a few DOM methods like getElementById. Since parsing HTML with regular expressions is a very bad idea, I wouldn't recommend this one for production.


34

I used jsdom because it has a ton of usage and is written by a prominent web hero. There are no promises that its behavior perfectly matches your browser (or even that every browser's behavior is the same), but it worked for me:

const jsdom = require("jsdom")
const { JSDOM } = jsdom
global.DOMParser = new JSDOM().window.DOMParser


23

You can use a Node implementation of DOMParser, such as xmldom. This will allow you to access DOMParser outside of the browser. For example:

var DOMParser = require('xmldom').DOMParser;
var parser = new DOMParser();
var document = parser.parseFromString('Your XML String', 'text/xml');
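
The question passes the parsed document to an xmlToJson helper that is never shown. As a rough illustration only, here is a minimal sketch of what such a helper might look like. It is an assumption, not the asker's actual function: the '@' prefix for attributes, the '#text' key and the entryAt helper are all hypothetical conventions chosen for this example. It relies only on standard DOM properties (nodeType, nodeName, nodeValue, attributes, childNodes), so it should work with a document from any of the DOM libraries discussed here.

```javascript
// Standard DOM nodeType constants.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;
const CDATA_SECTION_NODE = 4;
const DOCUMENT_NODE = 9;

// Read entry i from a NodeList/NamedNodeMap (via item) or from a plain array.
function entryAt(list, i) {
  return typeof list.item === 'function' ? list.item(i) : list[i];
}

// Hypothetical xmlToJson: converts a DOM document or element to plain data.
function xmlToJson(node) {
  if (node.nodeType === DOCUMENT_NODE) {
    const root = node.documentElement;
    return { [root.nodeName]: xmlToJson(root) };
  }

  const result = {};

  // Attributes become keys prefixed with '@' (an arbitrary convention).
  const attributes = node.attributes || [];
  for (let i = 0; i < attributes.length; i++) {
    const attr = entryAt(attributes, i);
    result['@' + attr.name] = attr.value;
  }

  let text = '';
  const children = node.childNodes || [];
  for (let i = 0; i < children.length; i++) {
    const child = entryAt(children, i);
    if (child.nodeType === TEXT_NODE || child.nodeType === CDATA_SECTION_NODE) {
      text += child.nodeValue;
    } else if (child.nodeType === ELEMENT_NODE) {
      const key = child.nodeName;
      const value = xmlToJson(child);
      if (!Object.prototype.hasOwnProperty.call(result, key)) {
        result[key] = value;
      } else if (Array.isArray(result[key])) {
        result[key].push(value); // third or later sibling with this name
      } else {
        result[key] = [result[key], value]; // second sibling: switch to an array
      }
    }
  }

  text = text.trim();
  // An element with no attributes and no child elements collapses to its text.
  if (Object.keys(result).length === 0) return text;
  if (text) result['#text'] = text;
  return result;
}
```

With a real parser this would be called as xmlToJson(parser.parseFromString(xmlhttp.responseText, 'text/xml')). Note that for a SOAP response the keys are the qualified element names, prefix included, for example 'soap:Envelope'.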

5 Comments

It does not work; it just spits out a lot of "entity not found" errors when trying to parse an HTML page.
xmldom is an old library. No querySelector support
xmldom is a poor substitute for DOMParser. It does not correctly parse peer elements.
It satisfies my needs as I am actually implementing this logic! Thanks, gist.github.com/chinchang/8106a82c56ad007e27b1
This is the replacement for the older xmldom package: npmjs.com/package/@xmldom/xmldom
18

There is no DOMParser in node.js, that's a browser thing. You can try any of these modules though:

https://github.com/joyent/node/wiki/modules#wiki-parsers-xml
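
Because DOMParser only exists as a global in browsers, code that has to run in both environments can look it up at runtime. The sketch below is one possible way to do that, not an established recipe: resolveDOMParser is a hypothetical helper name, and it assumes that either jsdom or @xmldom/xmldom has been installed from npm. The loadModule parameter defaults to require and exists only so the lookup can be swapped out.

```javascript
// Hypothetical helper: returns a DOMParser constructor for the current runtime.
// globalObject and loadModule are injectable; the defaults suit a CommonJS file.
function resolveDOMParser(globalObject = globalThis, loadModule = require) {
  // Browsers expose DOMParser globally, so use it when present.
  if (typeof globalObject.DOMParser === 'function') {
    return globalObject.DOMParser;
  }

  // Otherwise try jsdom, which exposes DOMParser on its window object.
  try {
    const { JSDOM } = loadModule('jsdom');
    return new JSDOM('').window.DOMParser;
  } catch (jsdomError) {
    // jsdom is not available; fall through to the next candidate.
  }

  // Finally try @xmldom/xmldom, which exports DOMParser directly.
  try {
    return loadModule('@xmldom/xmldom').DOMParser;
  } catch (xmldomError) {
    throw new Error('No DOMParser available: install jsdom or @xmldom/xmldom');
  }
}
```

Usage would then be const DOMParser = resolveDOMParser(); followed by new DOMParser().parseFromString(xmlString, 'text/xml'). Keep in mind that the two libraries do not behave identically, so the result may differ depending on which one is picked up.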

2 Comments

I know this thread is old, but here it goes. What about jquery for node? An ajax call with dataType xml should receive an xml dom response.
The link provided appears to have expired.
5

I really like htmlparser2. It's a fantastic, fast and lightweight library. I've created a small demo on how to use it on RunKit: https://runkit.com/jfahrenkrug/htmlparser2-demo/1.0.0

1 Comment

it's good to know that additional packages that make it actually useful (css-select, domutils) are not free
3
var DOMParser = require('xmldom').DOMParser;
var doc = new DOMParser().parseFromString(
    '<xml xmlns="a" xmlns:c="./lite">\n'+
        '\t<child>test</child>\n'+
        '\t<child></child>\n'+
        '\t<child/>\n'+
    '</xml>'
    ,'text/xml');


3

I needed DOMParser in Node. This is what I used:

const jsdom = require('jsdom');
const {JSDOM} = jsdom;

class DOMParser {
  parseFromString(s, contentType = 'text/html') {
    return new JSDOM(s, {contentType}).window.document;
  }
}

Now the example from this answer works for me:

function htmlDecode(input) {
  var doc = new DOMParser().parseFromString(input, "text/html");
  return doc.documentElement.textContent;
}

console.log(  htmlDecode("&lt;img src='myimage.jpg'&gt;")  )    
// "<img src='myimage.jpg'>"


1

I use yet another parser, Himalaya (available on npmjs.com), which converts an HTML string to a DOM tree and back:

import { parse, stringify } from 'himalaya';

const dom = parse(htmlString)

// Do something here

const htmlStringNext = stringify(dom)


0

The rss-parser package makes it easy to parse RSS and Atom feeds. If you are using Next.js, for example, you can simply create an API route like so:

import Parser from 'rss-parser'

export default async function API(req, res) {
    let parser = new Parser();
    try {
        const feed = await parser.parseURL(`https://www.nasa.gov/rss/dyn/lg_image_of_the_day.rss`);
        if (feed) return res.json({ "message": `Here is your data feed title`, status: 200, data: feed.title })
    } catch (error) {
        return res.json({ "message": "You made an invalid request", status: 401 })
    }
}


0

You can also use html-to-text if you only need HTML-to-text conversion, as in a few of the other answers to this question.

const { convert } = require('html-to-text');
// There is also an alias to `convert` called `htmlToText`.

const options = {
  wordwrap: 130,
  // ...
};
const html = '<div>Hello World</div>';
const text = convert(html, options);
console.log(text); // Hello World


-3

I used jsdom instead of DOMParser, and it works.
