How would I go about parsing HTML in C++ in my web server application?
3 Answers
libxml2 has an HTML parser. libxml++ is a C++ wrapper for libxml2, but I'm not sure whether it exposes the HTMLparser functionality.
Hand parsing gets messy, even for relatively trivial cases.
Have you considered a lexer/parser generator, such as Flex/Bison? I highly recommend ANTLR, along with ANTLRWorks, its grammar editor.
A picture is worth a thousand words, so this screenshot will show you why: http://www.antlr.org/works/screenshots/editor.jpg
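To make the point about hand parsing concrete, here is a deliberately naive sketch that uses only the standard library. The function `naive_hrefs` is a hypothetical helper written for this illustration, not something you should ship: it simply grabs whatever sits between `href="` and the next double quote.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Naive hand-rolled extractor (illustration only): collects the text
// between each literal href=" and the following double quote.
std::vector<std::string> naive_hrefs(const std::string& html) {
    std::vector<std::string> out;
    const std::string key = "href=\"";
    std::size_t pos = 0;
    while ((pos = html.find(key, pos)) != std::string::npos) {
        pos += key.size();
        const std::size_t end = html.find('"', pos);
        if (end == std::string::npos) {
            break;  // unterminated attribute value; give up
        }
        out.push_back(html.substr(pos, end - pos));
        pos = end + 1;
    }
    return out;
}
```

Even on this trivial task it misses `<a href='/b'>` (single quotes) and `<A HREF="/c">` (uppercase), and it happily reports links that sit inside `<!-- ... -->` comments. Each fix adds another special case, which is how hand parsing gets messy.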