Here is a method that works:
- Turn your string into an array by calling .split() on the newline character:
var arr = string.split('\n');
- Loop through the array, using this regex to remove any PHP strings:
let cln = arr[i].replace(/(.*)(<\?.*\?>)(.*)/, "$1$3");
- Store each "cleaned" line into a new array
Explanation of the RegEx (step 2)
() - represents a "capture group" (each line will be split into three captured groups)
(.*) - the first group contains all the characters from the beginning UNTIL...
(<\?.*\?>) - the 2nd capture group contains everything from <? to ?>, including all chars in between (i.e. a PHP string). The ? is escaped as \? because a bare ? is a regex quantifier
(.*) - the 3rd group contains any characters following a PHP string
$1, $2, $3 - refer to the contents of the three capture groups when used in the replacement string
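To see what each group captures, here is a small sketch that runs the same regex against one line. The sample line and the variable names are made up for illustration; only the regex comes from the steps above.

```javascript
// Hypothetical sample line: text on both sides of a PHP tag.
const sampleLine = '  <p><?php echo $name ?></p>';
const phpLine = /(.*)(<\?.*\?>)(.*)/;

const groups = sampleLine.match(phpLine);
console.log(groups[1]); // "  <p>"
console.log(groups[2]); // "<?php echo $name ?>"
console.log(groups[3]); // "</p>"

// Dropping group 2 keeps only the surrounding text.
const cleaned = sampleLine.replace(phpLine, '$1$3');
console.log(cleaned); // "  <p></p>"

// A line with no PHP tag does not match, so replace() returns it unchanged.
const plain = '<body>';
console.log(plain.match(phpLine)); // null
console.log(plain.replace(phpLine, '$1$3')); // "<body>"
```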
So, each line that contains a PHP string (as it is processed by the for loop) is split up into these three groupings of characters.
MOST of the lines contain no PHP string, so the regex does not match at all and .replace() returns the entire line unchanged.
On lines that contain a PHP string, the PHP is in capture group 2 (which is never returned). Returning group 1 and group 3 either returns an empty string, or it might return some spaces that preceded or followed the PHP string. So, we also use .trim() to remove those. If the trimmed line is zero-length, we do not include it in the new (output) array.
Complete example:
var htm = `<?php echo Hi Friends ?>
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<meta charset="utf-8">
<title></title>
<script type="text/javascript" src="js/loader.js" defer> </head>
<body>
<p>Hello</p>
<?php echo Hi Friend ?>
</body>
</html>`;
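// Split into lines, strip the PHP from each one, and keep the non-empty results.
var out = [], arr = htm.split('\n');
for (var i = 0; i < arr.length; i++) {
  // Drop capture group 2 (the PHP string); keep the text on either side.
  let cln = arr[i].replace(/(.*)(<\?.*\?>)(.*)/, "$1$3").trim();
  // Skip lines that are empty once the PHP and surrounding spaces are gone.
  if (cln.length > 0) {
    out.push(cln);
  }
}
console.log(out.join("\n"));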
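One limitation worth knowing: because the first (.*) is greedy, the regex above removes only the last PHP string when a line contains more than one. A possible variation, sketched below, uses a lazy .*? with the g flag so that every <? ... ?> pair on a line is removed. The function name stripPhpLines and the sample input are hypothetical, not part of the original answer, and like the loop above this still assumes each PHP string starts and ends on the same line.

```javascript
// Hypothetical helper: remove single-line PHP blocks, then drop empty lines.
function stripPhpLines(source) {
  return source
    .split('\n')
    // Lazy .*? with the g flag removes every <? ... ?> pair on the line.
    .map((line) => line.replace(/<\?.*?\?>/g, '').trim())
    .filter((line) => line.length > 0);
}

// Made-up input: the third line holds two PHP blocks.
const sample = [
  '<?php echo "Hi" ?>',
  '<p>Hello</p>',
  '  <?php echo $a ?> and <?php echo $b ?>  ',
  '</body>',
].join('\n');

console.log(stripPhpLines(sample).join('\n'));
// <p>Hello</p>
// and
// </body>
```

Since the groups before and after the PHP string are no longer needed, the replacement is just an empty string instead of "$1$3".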