I want to pass the URL of a web page containing a <span id="spanID"> value </span> tag to a method such as setTextBoxText(string url, string id), written in the code-behind of a WPF application (MainWindow.xaml.cs), and set the Text of a specific TextBox control to the span's value without loading the web page (for example, to track the price of a product on Amazon).

I would prefer to execute JavaScript code to read the values of HTML elements and set the content of WPF controls to the result of that JavaScript function.

Something like this:

public partial class MainWindow : Window
{
    private readonly string url = "https://websiteaddress.com/rest";

    public MainWindow()
    {
        InitializeComponent();
        setTextBoxText(url, "spanID");
    }

    // Not static: the method touches the txtPrice instance member.
    void setTextBoxText(string url, string id)
    {
        // code to get document by given url
        txtPrice.Text = getHtmlElementValue(id);
    }

    string getHtmlElementValue(string id)
    {
        // what code should be written here?
        // any combination of js and c#?
        // var result = document.getElementById(id).textContent;
        // return result;
        return string.Empty; // placeholder so the outline compiles
    }
}

1 Answer

You can use HttpClient to download the HTML content of a URL and then process the DOM with JavaScript-like syntax by writing the response into an mshtml.HTMLDocument. This requires a reference to Microsoft.mshtml.dll:

// Requires a reference to Microsoft.mshtml.dll and these usings:
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using mshtml;

private HTMLDocument HtmlDocument { get; set; }

private async Task SetTextBoxTextAsync(string url, string id)
{
  // Wait for the download to finish before querying the document.
  await UpdateHtmlDocumentAsync(url);
  txtPrice.Text = GetHtmlElementValueById(id);
}

public async Task UpdateHtmlDocumentAsync(string url)
{
  using (HttpClient httpClient = new HttpClient())
  {
    byte[] response = await httpClient.GetByteArrayAsync(url);
    // Decode the full response buffer as UTF-8.
    string httpResponseText = Encoding.UTF8.GetString(response);
    string htmlContent = WebUtility.HtmlDecode(httpResponseText);

    var document = new HTMLDocument();
    ((IHTMLDocument2)document).write(htmlContent);
    this.HtmlDocument = document;
  }
}

// Returns null instead of throwing when the document or the element is missing.
public string GetHtmlElementValueById(string elementId)
  => this.HtmlDocument?.getElementById(elementId)?.innerText;
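
For context, here is a minimal sketch of how SetTextBoxTextAsync might be called from the window. It assumes the methods above are members of MainWindow and that txtPrice is the TextBox declared in MainWindow.xaml. The Loaded handler (OnLoaded) and the error text are illustrative choices, not part of the original code; the URL and id are the placeholders from the question.

```csharp
using System;
using System.Windows;

public partial class MainWindow : Window
{
    public MainWindow()
    {
        InitializeComponent();
        // Start the lookup once the window is loaded (illustrative choice).
        Loaded += OnLoaded;
    }

    // async void is acceptable here because this is an event handler.
    private async void OnLoaded(object sender, RoutedEventArgs e)
    {
        try
        {
            // Awaiting means txtPrice is only set after the download finishes.
            await SetTextBoxTextAsync("https://websiteaddress.com/rest", "spanID");
        }
        catch (Exception ex)
        {
            // Network and parsing failures surface here instead of crashing the app.
            txtPrice.Text = "Lookup failed: " + ex.Message;
        }
    }
}
```

Because the handler awaits the call, the UI thread stays responsive while the page downloads.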

Comments

How do I know whether the document is completely loaded? This line: return this.HtmlDocument.getElementById(elementId).innerText; threw System.NullReferenceException: 'Object reference not set to an instance of an object.'
I originally posted a different implementation. When I refactored this solution, I forgot to adjust the method's return type; I have now fixed the return type of UpdateHtmlDocumentAsync. This is also the method that loads the HTML document, so you have to await it. When UpdateHtmlDocumentAsync returns, the document (assigned to the HtmlDocument property) is loaded and ready to process.
Where is this exception thrown, inside GetHtmlElementValueById? The code works for me. this.HtmlDocument can return null if the URL doesn't resolve.
Sorry, I think I was accidentally observing the wrong event. I have changed the code to observe the WebBrowser.LoadCompleted event instead. This should fix your exception.
Yes, inside GetHtmlElementValueById. But the URL I entered is a valid web page!
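
The thread does not settle why the lookup returns null for a valid URL. One possible cause, not confirmed above, is that the span is not in the markup the server sends: HttpClient does not run page scripts, so elements created by JavaScript after load never appear in the response. The sketch below is a rough diagnostic for that case. The SpanProbe class and HasElementId helper are hypothetical names, and the regular expression is a simple text check, not an HTML parser.

```csharp
using System;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public static class SpanProbe
{
    private static readonly HttpClient Client = new HttpClient();

    // True when the raw markup contains an id attribute with exactly this value.
    // The lookbehind rejects attributes such as data-id; the lookahead rejects
    // longer ids that merely start with the requested one.
    public static bool HasElementId(string html, string id)
    {
        string pattern = @"(?<![\w-])[iI][dD]\s*=\s*[""']?"
                       + Regex.Escape(id)
                       + @"(?=[""'\s>/])";
        return Regex.IsMatch(html, pattern);
    }

    public static async Task Main(string[] args)
    {
        // Placeholder URL and id taken from the question; pass real ones as arguments.
        string url = args.Length > 0 ? args[0] : "https://websiteaddress.com/rest";
        string id = args.Length > 1 ? args[1] : "spanID";

        string html = await Client.GetStringAsync(url);
        Console.WriteLine(HasElementId(html, id)
            ? "id found in raw HTML"
            : "id not found in raw HTML (it may be added by script after load)");
    }
}
```

If the id is missing from the raw response, parsing it with mshtml cannot find the element either, and a browser-based approach would be needed to get the rendered value.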