I have this:

<div class="ResultItem">
<table border="0" cellpadding="0" cellspacing="0" style="top: 0; left: 0; width: 100%;">
    <tr>
        <td class="result">
            <a href="http://msdn.microsoft.com/en-us/library/system.windows.uielement.aspx" onclick="trackClick(this, '117', 'http\x3a\x2f\x2fmsdn.microsoft.com\x2fen-us\x2flibrary\x2fsystem.windows.uielement.aspx', '1');"><b>UIElement</b> Class &#40;System.Windows&#41;</a>&nbsp;
            <div class="ResultDescription"><b>UIElement</b> is a base class for WPF core level implementations building on Windows Presentation Foundation &#40;WPF&#41; elements and basic presentation characteristics.</div>
            <div class="ResultUrl">msdn.microsoft.com&#47;en-us&#47;library&#47;sy<wbr><a class="wbr"></a>stem.windows.<b>uielement</b>.aspx</div>
        </td>
    </tr>
</table>
</div>

I want to extract the text inside <a>(grab this string)</a> and inside <div class="ResultDescription">(grab this data)</div>. How would I do this?

5 Answers

3

Your best bet long term is to use a dedicated HTML parsing library rather than custom string manipulation. There's a trunk version of HtmlAgilityPack called HAPPhone that works on Windows Phone 7. You will have to download it manually from CodePlex, but it still beats writing a parser yourself.

1

If your goal is to read content from the MSDN website, there is an actual web service API for that:

http://services.msdn.microsoft.com/ContentServices/ContentService.asmx

So screen scraping isn't necessary. Just add a reference to that URL.

0

If (and only if!) your HTML is valid XHTML, you can use any XML parser to get what you want.

0

To reiterate what BrokenGlass mentioned, the overwhelming answer to "What is the best way to parse html in C#?" is to use a library like HtmlAgilityPack; for the phone, this means something like HAPPhone.

0

If you only need to parse a short string, you can process the HTML content with JavaScript. The following line uses a regular expression to strip the HTML tags and leave plain text.

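// JavaScript
// html_string is assumed to hold the markup to clean up.
// [^>]* also matches tags that span several lines, which .*? does not.
var normal_text = html_string.replace(/<[^>]*>/g, "");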
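
Stripping every tag leaves the link text and the description run together, so here is a rough sketch of how the same regular-expression idea could pull out the two pieces the question asks for. The names `sampleHtml`, `stripTags`, `decodeEntities` and `extractResult` are made up for illustration, and the sample markup is abbreviated from the question. It assumes the description `<div>` contains no nested `<div>` and that the first `<a>` is the one you want; anything less regular than that is a reason to use a real parser instead.

```javascript
// Abbreviated copy of the markup from the question (onclick handler omitted).
const sampleHtml = [
  '<td class="result">',
  '  <a href="http://msdn.microsoft.com/en-us/library/system.windows.uielement.aspx"><b>UIElement</b> Class &#40;System.Windows&#41;</a>&nbsp;',
  '  <div class="ResultDescription"><b>UIElement</b> is a base class for WPF core level implementations.</div>',
  '</td>'
].join('\n');

// Remove every tag, keeping only the text between tags.
function stripTags(html) {
  return html.replace(/<[^>]*>/g, '');
}

// Decode numeric character references (&#40; or &#x28;), then &nbsp; and &amp;.
// &nbsp; is treated as a plain space here; other named entities are left alone.
function decodeEntities(text) {
  return text
    .replace(/&#x([0-9a-f]+);/gi, (_, hex) => String.fromCodePoint(parseInt(hex, 16)))
    .replace(/&#(\d+);/g, (_, dec) => String.fromCodePoint(parseInt(dec, 10)))
    .replace(/&nbsp;/g, ' ')
    .replace(/&amp;/g, '&');
}

// Hypothetical helper: grab the first <a> and the ResultDescription <div>.
function extractResult(html) {
  const link = /<a\b[^>]*>([\s\S]*?)<\/a>/i.exec(html);
  const desc = /<div\b[^>]*class="ResultDescription"[^>]*>([\s\S]*?)<\/div>/i.exec(html);
  const clean = (match) => (match ? decodeEntities(stripTags(match[1])).trim() : null);
  return { title: clean(link), description: clean(desc) };
}

const result = extractResult(sampleHtml);
console.log(result.title);       // UIElement Class (System.Windows)
console.log(result.description); // UIElement is a base class for WPF core level implementations.
```

Either field comes back as `null` when its element is not found, so a caller can tell "missing" apart from "empty".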
