I am a novice PowerShell user, so please bear with me. I am trying to parse an HTML table in PowerShell to extract the strings between the tags. Here is the HTML:
<html>
<head>
<title>HTML TABLE</title>
</head><body>
<table>
<colgroup><col/><col/></colgroup>
<tr><th>TestcaseName</th><th>Status</th></tr>
<tr><td>abcd </td><td>First </td></tr>
<tr><td>xyz </td><td>Second </td></tr>
<tr><td>pqr </td><td>Third </td></tr>
</table>
</body>
</html>
Here is the code I have tried:
$arr = @()
$path = "C:\test.html"
$pattern = '(?i)<tr[^>]*><td[^>]*>(.*)</td><td>'
Get-Content $path | ForEach-Object {
    if ([Regex]::IsMatch($_, $pattern)) {
        $arr += [Regex]::Match($_, $pattern)
    }
}
$arr | ForEach-Object { $_.Value }
The expected output is:
abcd
xyz
pqr
But the actual output is:
<tr><td>abcd </td><td>
<tr><td>xyz </td><td>
<tr><td>pqr </td><td>
Can anyone explain why the tags are included in the output, and how to avoid this? I also want to add text to each array element, e.g. <a href="\\192.116.1.2\cluster_110">abcd, <a href="\\192.116.1.3\cluster_110">xyz, etc. Please show how to do that as well, since it involves special characters.
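For reference, here is a minimal sketch of one possible fix, not a definitive solution. The Value property of a Match object is the whole matched text, tags included; the text captured by the parentheses is in Groups[1]. The sketch also makes the capture lazy and trims the trailing space. The $share variable and the single share path used for every link are assumptions for illustration only, since the question does not say how names map to hosts.

```powershell
$path = "C:\test.html"
# Lazy (.*?) so the capture stops at the first </td> on the line
$pattern = '(?i)<tr[^>]*><td[^>]*>(.*?)</td><td>'

$names = Get-Content $path | ForEach-Object {
    $m = [Regex]::Match($_, $pattern)
    if ($m.Success) {
        # $m.Value (Groups[0]) is the entire match, tags included;
        # Groups[1] holds only what the parentheses captured.
        $m.Groups[1].Value.Trim()
    }
}
$names

# Assumption: one share path for all elements (hypothetical; adjust per element as needed).
# Single-quoted strings keep backslashes and double quotes literal,
# and -f fills in the {0} and {1} placeholders.
$share = '\\192.116.1.2\cluster_110'
$links = $names | ForEach-Object { '<a href="{0}">{1}' -f $share, $_ }
$links
```

With the HTML above, $names should contain abcd, xyz and pqr, because the header row uses th cells and does not match the pattern. Line-by-line regex matching is fragile for HTML in general; it works here only because each table row sits on a single line.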