I am trying to parse an HTML web page to extract information from it. Here is a sample of the source:

<div class="market_listing_row market_recent_listing_row listing_2107979855708535333" id="listing_2107979855708535333">

<div class="market_listing_item_img_container">     <img id="listing_2107979855708535333_image" src="asdgfasdfgasgasgdasgasdgsdasgsadg" style="border-color: #D2D2D2;" class="market_listing_item_img" alt="" />    </div>
        <div class="market_listing_right_cell market_listing_action_buttons">
                <div class="market_listing_buy_button">
                            <a href="javascript:BuyMarketListing('listing', '2107979855708535333', 570, '2', '508690045')" class="item_market_action_button item_market_action_button_green">
                <span class="item_market_action_button_edge item_market_action_button_left"></span>
                <span class="item_market_action_button_contents">
                    Buy Now                 </span>
                <span class="item_market_action_button_edge item_market_action_button_right"></span>
                <span class="item_market_action_button_preload"></span>
            </a>
                        </div>
        </div>
    <div class="market_listing_right_cell market_listing_their_price">
    <span>
                    <span class="market_listing_price market_listing_price_with_fee">
            0,03 p&#1091;&#1073;.           </span>
        <span class="market_listing_price market_listing_price_without_fee">
            0,01 p&#1091;&#1073;.           </span>
        <br/>
                </span>

Basically, I need to get the text enclosed in this span:

<span class="market_listing_price market_listing_price_with_fee">
        0,03 p&#1091;&#1073;.           </span>

I have tried to use HtmlAgilityPack, but can't figure out how to do it.

2 Answers

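You can use HtmlAgilityPack. Note that LoadHtml expects the HTML markup itself, not a URL:

using System.Net; // for WebUtility

var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html); // html holds the page source as a string

// Select the span that holds the price including the fee, as asked in the question.
var node = doc.DocumentNode
            .SelectSingleNode("//span[@class='market_listing_price market_listing_price_with_fee']");

// SelectSingleNode returns null when nothing matches, so guard before reading InnerText.
var text = node == null ? null : WebUtility.HtmlDecode(node.InnerText).Trim();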
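
A listings page usually contains one such span per listing, so a variant that collects every match may be more useful than a single lookup. The following is only a minimal sketch under that assumption: the names PriceParser and ExtractPricesWithFee are hypothetical, and the XPath uses contains() on the class attribute instead of an exact match, which is a design choice of this sketch rather than something the page requires.

```csharp
using System.Collections.Generic;
using System.Net;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class PriceParser
{
    // Returns the decoded, trimmed text of every "with fee" price span in the markup.
    public static List<string> ExtractPricesWithFee(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // expects markup, not a URL

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes(
            "//span[contains(@class, 'market_listing_price_with_fee')]");

        var prices = new List<string>();
        if (nodes == null)
            return prices;

        foreach (var node in nodes)
        {
            // InnerText keeps entities such as &#1091; as-is, so decode them explicitly.
            prices.Add(WebUtility.HtmlDecode(node.InnerText).Trim());
        }

        return prices;
    }
}
```

Returning an empty list when nothing matches avoids the null reference exception that comes from iterating over a null SelectNodes result.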

3 Comments

Thank you. I think I was just using Market_listing_price instead of the full deal. I will check it when I get home from work. Thank you.
After getting home and firing this up, I am still getting an Object Reference error. I have changed it from SelectSingleNode to a foreach and SelectNodes since there are multiple nodes with that same span, all reporting a different value. It appears to not be finding the span, and returning null.
I have figured out the issue. It was how I was loading the HTML. I didn't realize I needed to create an HttpWebRequest and read its response before passing the content into the HtmlDocument.

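I figured out that you cannot just pass a URL to doc.LoadHtml, because it expects the HTML markup itself. You have to download the page first, for example with an HttpWebRequest and its HttpWebResponse, and then pass the downloaded string to LoadHtml.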
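
The download step can be sketched roughly as follows. This is an illustration rather than the exact code used: the names PageLoader, Parse, and Download are hypothetical, the URL is supplied by the caller, and the response is assumed to be UTF-8 text. On recent .NET versions WebRequest.Create is marked obsolete, so treat this as a sketch of the HttpWebRequest approach described above.

```csharp
using System.IO;
using System.Net;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class PageLoader
{
    // Reads the whole stream as text and parses it as HTML.
    public static HtmlDocument Parse(Stream stream)
    {
        using (var reader = new StreamReader(stream)) // assumes UTF-8 content
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(reader.ReadToEnd());
            return doc;
        }
    }

    // Downloads the page at the given URL and parses the response body.
    public static HtmlDocument Download(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        {
            return Parse(stream);
        }
    }
}
```

HtmlAgilityPack also ships an HtmlWeb class whose Load(url) method fetches and parses a page in one call, which may be simpler when no custom request handling is needed.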
