I have some files that were displayed in a browser, and I then used File, Save As... to save the text to a local file. The page contains some scripting and will not display properly in a WebBrowser control on a WinForm. The problem appears to be the scripts, as the control displays "script error" dialogs. I don't really need to view the file, just to retrieve a few elements by ID.
The first block of code below does load the file into a local object, but only the first 4096 bytes. (Same happens if I use a WebBrowser resident on the form.)
The second block doesn't complain, but GetElementById fails because the desired element lies beyond the first 4096 bytes.
    Dim web As New WebBrowser
    web.AllowWebBrowserDrop = False
    web.ScriptErrorsSuppressed = True
    web.Url = New Uri(sFile)
    Dim doc As HtmlDocument
    Dim elem As HtmlElement
    doc = web.Document
    elem = doc.GetElementById("userParts")
What am I doing wrong?
Is there a better approach for a VB.Net WinForm project for loading an HTML document from which I can read elements?
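I just went with string functions for the simple task at hand:
    Function GetInnerTextByID(html As String, elemID As String) As String
        Try
            ' Skip the head; match "<body" so a body tag with attributes is also found.
            Dim s As String = html.Substring(html.IndexOf("<body"))
            ' Jump to the first occurrence of the ID text.
            s = s.Substring(s.IndexOf(elemID))
            ' Move past the end of the opening tag.
            s = s.Substring(s.IndexOf(">") + 1)
            ' Keep everything up to the next tag (no nested markup is handled).
            s = s.Substring(0, s.IndexOf("<"))
            s = s.Replace(vbCr, "").Replace(vbLf, "").Trim()
            Return s
        Catch ex As Exception
            ' Any failed lookup (IndexOf returned -1) ends up here.
            Return ""
        End Try
    End Function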
I'd still be interested in a native VB.Net (non-ASP) approach, or in why the code in the question only loads 4096 bytes.
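One plausible explanation, not confirmed anywhere in this thread, is that WebBrowser navigation is asynchronous, so reading web.Document immediately after setting Url may see only a partially loaded document. A minimal sketch under that assumption waits for the DocumentCompleted event before querying. Form1 and LoadFile are hypothetical names; sFile and "userParts" come from the question.

```vbnet
Imports System.Diagnostics
Imports System.Windows.Forms

Public Class Form1
    Inherits Form

    ' WithEvents lets the Handles clause below subscribe to the control's events.
    Private WithEvents web As New WebBrowser()

    ' Hypothetical helper: starts loading the saved file and returns immediately.
    Private Sub LoadFile(sFile As String)
        web.AllowWebBrowserDrop = False
        web.ScriptErrorsSuppressed = True
        web.Navigate(New Uri(sFile))
    End Sub

    ' Raised once the document has finished loading (assumed to be the point
    ' at which the whole file is available).
    Private Sub web_DocumentCompleted(sender As Object,
                                      e As WebBrowserDocumentCompletedEventArgs) _
                                      Handles web.DocumentCompleted
        Dim elem As HtmlElement = web.Document.GetElementById("userParts")
        If elem IsNot Nothing Then
            Debug.WriteLine(elem.InnerText)
        End If
    End Sub
End Class
```

The event is only raised while the form's message loop is running, so this would need to live in the WinForm itself rather than in a routine that blocks the UI thread.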
HtmlAgilityPack has a document.GetElementById method (spelled GetElementbyId in the library), which is pretty simple. It has no strange issues with scripts or byte limits. Just load the document from the web, a file, or a plain string.
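A minimal sketch of that approach, assuming the HtmlAgilityPack NuGet package is referenced in the project. The module and function names are hypothetical; Load, LoadHtml, and GetElementbyId are the library's own methods. The type is fully qualified because System.Windows.Forms also defines an HtmlDocument class.

```vbnet
Imports HtmlAgilityPack

Module HtmlLookup
    ' Hypothetical helper: returns the inner text of the element with the
    ' given ID from a saved HTML file, or "" if no such element exists.
    Function GetInnerTextFromFile(path As String, elemId As String) As String
        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.Load(path)                      ' parses the whole file; scripts are not executed
        Dim node As HtmlNode = doc.GetElementbyId(elemId)
        If node Is Nothing Then Return ""
        Return node.InnerText.Trim()
    End Function

    ' Same lookup when the HTML is already in a string.
    Function GetInnerTextFromString(html As String, elemId As String) As String
        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.LoadHtml(html)
        Dim node As HtmlNode = doc.GetElementbyId(elemId)
        If node Is Nothing Then Return ""
        Return node.InnerText.Trim()
    End Function
End Module
```

For the file in the question, a call such as GetInnerTextFromFile(sFile, "userParts") would replace both the WebBrowser control and the string functions.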