I want to download the HTML of a web page, parse it, and retrieve the HTTP links from its elements. I can fetch the HTML into a string, and when the markup is XML I can use a For Each loop to retrieve the data. What I can't figure out is how to take the HTML string and make it queryable with LINQ. I think I'm missing some simple step here.

' At the top of the file:
Imports System.Net            ' WebClient
Imports System.Windows.Forms  ' MessageBox
Imports System.Xml.Linq       ' XElement

Sub GetTest()
    Dim source As String = "http://gd2.mlb.com/components/game/mlb/year_2018/month_03/day_29/"
    Dim Client As New WebClient
    Dim html As String = Client.DownloadString(source)

    Dim xml = XElement.Parse(html)

    Dim links = From link In xml...<a>

    For Each link In links
        MessageBox.Show(link.@href)
    Next
End Sub
  • How is this different from your last question? Commented Jan 13, 2018 at 17:54
  • My last question was answered. I was able to parse the xml from a different page. This one is html and it doesn't seem to work the same. Commented Jan 13, 2018 at 17:56
  • You should install HtmlAgilityPack via NuGet. Commented Jan 13, 2018 at 17:58
  • stackoverflow.com/q/516811/1070452 Commented Jan 13, 2018 at 18:00
  • HTML is not well-formed XML. Things like unclosed tags (<br>) and simple JavaScript expressions such as a < 5 can break the parser. That's why HtmlAgilityPack exists. Commented Jan 13, 2018 at 18:02

1 Answer

This particular page can be parsed as XML once the first unclosed tag is removed:

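' Skip everything up to and including the first ">" (the unclosed tag), then parse the rest as XML.
Dim xml = XElement.Parse(html.Substring(html.IndexOf(">") + 1))
For Each link In xml.Descendants("a")
    ' .Value prints just the URL; printing the XAttribute itself would print href="...".
    Console.WriteLine(link.Attribute("href")?.Value)
Next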

In general, though, parsing HTML as if it were well-formed XML runs into multiple problems, so it is better to use HtmlAgilityPack.
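
For reference, here is a minimal sketch of what the HtmlAgilityPack route might look like. It assumes the HtmlAgilityPack NuGet package is installed; the module name, the GetLinks helper, and the sample markup are made up for illustration and are not taken from the real page. The type is fully qualified because a WinForms project also has System.Windows.Forms.HtmlDocument in scope.

```vbnet
Imports System
Imports System.Collections.Generic

Module LinkExtractor
    ' Hypothetical helper: returns the href value of every <a> element that has one.
    Function GetLinks(html As String) As List(Of String)
        ' LoadHtml tolerates markup that is not well-formed XML, e.g. an unclosed <br>.
        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.LoadHtml(html)

        Dim links As New List(Of String)

        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        Dim anchors = doc.DocumentNode.SelectNodes("//a[@href]")
        If anchors Is Nothing Then Return links

        For Each anchor As HtmlAgilityPack.HtmlNode In anchors
            links.Add(anchor.GetAttributeValue("href", String.Empty))
        Next
        Return links
    End Function

    Sub Main()
        ' Made-up sample markup standing in for the downloaded page.
        Dim sample As String =
            "<html><body><br><a href=""gid_1/"">one</a><a href=""gid_2/"">two</a></body></html>"

        For Each link In GetLinks(sample)
            Console.WriteLine(link)
        Next
    End Sub
End Module
```

In the original GetTest, the string returned by Client.DownloadString(source) could be passed to GetLinks in place of the sample, with no need to trim the leading tag first.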
