I'm building an ASP.NET MVC 2 website that lets users store and analyze data about goods found on various online trade sites. When a user fills in the form to create or edit an item, there should be an "Import data" button that automatically fills some fields based on data from a third-party website.

The question is: what should this button do under the hood?

I see at least two possible solutions.
First: do the import on the client side using AJAX and the jQuery load method.
I tried it in IE8 and got a browser warning popup about insecure script actions, which is completely unacceptable.
Second: add a method ImportData(string URL) to the ItemController class. It is called via AJAX, does the import and data processing server-side, and returns the result to the client as JSON.
I tried it and got a server exception, "(503) Server Unavailable", when loading the HTML data into an XMLDocument. I also have a feeling that dealing with HTML that is not well formed (missing closing tags, etc.) will be a huge pain. Any ideas on how to parse such HTML documents?

  • The first won't work due to cross domain browser restrictions. As far as the second is concerned, could you show some sample code? Commented Apr 18, 2010 at 7:31
  • Never mind, I realised that HtmlAgility is much easier to use than plain XMLDocument, and it does not produce errors. Commented Apr 18, 2010 at 8:32

2 Answers

Unfortunately, you can't do cross-site loading with JavaScript unless you use JSONP; the browser blocks it for security reasons. Your best bet is to send an AJAX request to one of your controller's actions and have that action make the web request and return the result to the client.
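
The server-side flow described above can be sketched as follows. This is only an illustration in Python rather than ASP.NET code, since the thread itself contains no code; the names `fetch_html` and `import_data`, the 10-second timeout, and the shape of the JSON payload are all assumptions. The function stands in for a controller action: it makes the web request on the server and hands a JSON string back to the browser.

```python
import json
from urllib.request import urlopen


def fetch_html(url):
    """Download a page on the server and return its body as text."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def import_data(url, fetch=fetch_html):
    """Hypothetical stand-in for a controller action called via AJAX.

    The fetch parameter is injectable so the function can be exercised
    without touching the network.
    """
    try:
        html = fetch(url)
    except OSError as exc:
        # urllib's URLError and HTTPError (e.g. a 503) derive from OSError.
        return json.dumps({"ok": False, "error": str(exc)})
    return json.dumps({"ok": True, "html": html})
```

Because the browser only ever talks to its own server, the cross-domain restriction never comes into play. A real action would extract the needed fields from the HTML before responding instead of returning the raw page.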

As far as the 503 Server Unavailable goes, does this happen on every request? It sounds like you're parsing information from WoW Armory. They throttle web requests and will ban you after a certain amount of time.

1 Comment

It is not WoW Armory; I can load the page HTML as a string without errors. But when I try to load this string into an XMLDocument for further parsing, I receive this error.

Use http://htmlagilitypack.codeplex.com/ to process HTML on the server. Or regexps. Or string.IndexOf. Or import MSHTML via an Interop library and use it. Do not load HTML into XML documents unless you're absolutely sure it's pure XHTML.

Also, check whether the third-party websites provide more direct access to the data: XML, REST, web services.
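
To illustrate the warning about loading HTML into an XML document, here is a minimal sketch in Python using only the standard library (Python is used purely for illustration; in .NET the analogous pair would be XMLDocument versus a lenient parser such as the HTML Agility Pack). The fragment `BROKEN_HTML` and the helper names are made up for the example. A strict XML parser rejects the unclosed tags, while a tolerant HTML parser still yields the text.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical fragment: the <br> and <p> tags are never closed.
BROKEN_HTML = "<html><body><p>Price: <b>42</b><br><p>In stock</body></html>"


def parses_as_xml(markup):
    """Return True only if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TextCollector(HTMLParser):
    """Lenient parser that gathers text nodes and ignores tag balance."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_text(markup):
    """Feed the markup to the lenient parser and return the text pieces."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return collector.chunks
```

Here `parses_as_xml(BROKEN_HTML)` returns `False`, whereas `extract_text(BROKEN_HTML)` still recovers the text pieces, which is the behaviour you want when scraping pages you do not control.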
