My Node app fetches an HTML page with axios, parses it with htmlparser2, and then sends the relevant information to a frontend JS app as JSON.
The HTML page has some JavaScript in it that creates an array, and I need to work with that array in my code. htmlparser2 gets the content of the script as a string. I have two options to handle it as far as I know:
- Write a parser that goes through the string and extracts the required info (doable, but complicated)
- Run the string as JavaScript code and read the values from the result.
Assume I want to go with option 2. According to this StackOverflow question, using Node's VM module is possible, but the official documentation says "The node:vm module is not a security mechanism. Do not use it to run untrusted code."
I consider the code in my use case untrusted. What would be a safe solution for this?
EDIT: A snippet from the string:
hatizsakCucc = new Array();
hazbanCucc = new Array();
function adatokMessage(targyIndexStr,tomb) {
var targyIndex = parseInt(targyIndexStr);
if (tomb.length<1) alert("Nincs semmi!");
else alert(tomb[targyIndex]);
}
hatizsakCucc[0]="Név: ezüst\nSúly: 0.0001 kg.\nMennyiség: 453\nÖsszsúly: 0.0453 kg.\n";
hatizsakCucc[1]="Név: kaja\nSúly: 0.4 kg.\nÁr: 2 ezüst\nMennyiség: 68\nÖsszár: 136 ezüst\nÖsszsúly: 27.2 kg.\n";
hatizsakCucc[2]="Típus: fegyver\nNév: bot\nSúly: 2 kg.\nÁr: 6 ezüst\nMin. szint: 1\nMaximum sebzés: 6\nSebzés szórás: 5\nFajta: ütő/zúzó\n";
hatizsakCucc[3]="Típus: fegyver\nNév: parittya\nSúly: 0.3 kg.\nÁr: 14 ezüst\nMin. szint: 1\nMaximum sebzés: 7\nSebzés szórás: 4\nFajta: távolsági\n";
hatizsakCucc[4]="Név: csodatarisznya\nSúly: 4 kg.\nÁr: 1000 ezüst\nExtra: templomi árú\n";
hatizsakCucc[5]="Név: imamalom\nSúly: 5 kg.\nÁr: 150 ezüst\nExtra: templomi árú\n";
The whole string is about 100 lines of this, so it's not too much data.
What I need is the contents of the hatizsakCucc array. Actually, I'm realizing now that getting those entries is not too difficult with a regex:
hatizsakSzkript.match(/hatizsakCucc(.*)\\n/g);
This gives me an array of the hatizsakCucc elements, so I guess my problem is solved.
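For what it's worth, here is a slightly stricter sketch of the same regex idea that never executes anything. The helper name `extractArray` is made up for illustration. It assumes every entry has the form `name[index]="...";` with a double-quoted literal, that the array name is a plain identifier, and that the literal only uses escapes JSON also understands (such as `\n`, `\"`, `\\`), so `JSON.parse` can decode it.

```javascript
// Hypothetical helper: collect the string literals assigned to `arrayName[i]`
// from the script source without running any of it.
// Assumes arrayName is a plain identifier (no regex metacharacters).
function extractArray(scriptSource, arrayName) {
  // Matches e.g.  hatizsakCucc[3]="...";  capturing the index and the quoted literal.
  const pattern = new RegExp(
    arrayName + '\\[(\\d+)\\]\\s*=\\s*("(?:[^"\\\\]|\\\\.)*")\\s*;',
    'g'
  );
  const result = [];
  for (const match of scriptSource.matchAll(pattern)) {
    // JSON.parse only decodes the literal (e.g. \n becomes a newline); it evaluates no code.
    result[Number(match[1])] = JSON.parse(match[2]);
  }
  return result;
}

// Usage with a shortened sample; "\\n" here stands for the backslash-n in the page's script.
const sample =
  'hatizsakCucc = new Array();\n' +
  'hatizsakCucc[0]="Név: ezüst\\nSúly: 0.0001 kg.\\n";\n' +
  'hatizsakCucc[1]="Név: kaja\\nSúly: 0.4 kg.\\n";\n';
const items = extractArray(sample, 'hatizsakCucc');
console.log(items.length); // 2
```

If the page ever used single-quoted literals or escapes like `\x41`, `JSON.parse` would throw, so this would need adjusting.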
That said, I'm still curious about the possibility of running "untrusted" code safely.
Further context: I plan to parse each array element into an object whose properties come from the substrings separated by the \n sequences.
So the expected result for the first array element will be:
hatizsakCucc[0] = {
nev: "ezüst",
suly: 0.0001,
mennyiseg: ...
}
I'll write a function that splits the string into substrings at each \n and then parses the data with match().
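A rough sketch of what that function could look like. The name `parseItem` and the key/value rules are my own assumptions: it expects the escapes to be already decoded into real newline characters, builds keys by stripping accents and non-alphanumeric characters, and turns a value into a number when it starts with one, dropping any trailing unit such as "kg." or "ezüst".

```javascript
// Hypothetical helper: turn one decoded entry into a plain object.
function parseItem(itemText) {
  const item = {};
  for (const line of itemText.split('\n')) {
    const match = line.match(/^([^:]+):\s*(.*)$/);
    if (!match) continue; // skips the empty piece after the trailing newline

    // "Súly" -> "suly", "Min. szint" -> "minszint" (assumed naming scheme)
    const key = match[1]
      .normalize('NFD')
      .replace(/[\u0300-\u036f]/g, '') // remove combining accent marks
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, '');

    // Assumption: a leading number is the value and the rest is a unit to drop.
    const numeric = match[2].match(/^(-?\d+(?:\.\d+)?)(?:\s|$)/);
    item[key] = numeric ? Number(numeric[1]) : match[2];
  }
  return item;
}

console.log(parseItem('Név: ezüst\nSúly: 0.0001 kg.\nMennyiség: 453\nÖsszsúly: 0.0453 kg.\n'));
// { nev: 'ezüst', suly: 0.0001, mennyiseg: 453, osszsuly: 0.0453 }
```

Dropping the unit loses information (for example the currency in "Ár: 2 ezüst"), so keeping the raw string alongside the number may be preferable.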
"some JavaScript in it that creates an array": can we see that code? Parsing JavaScript is not necessarily complicated; there are quite a few tools that can handle that.