I am wondering why the responseText property of an MSXML2.ServerXMLHTTP object is not returning the full HTML source. It appears to return only the "inner HTML". I can create an IE object and get the "outer HTML", but that is not very efficient since I have hundreds of search items.

I have the function shown below (with the URL) that assigns the HTML content to a string.

Sub test()
    Dim myString As String
    myString = getECICS("103-90-2") ' myString only contains inner html
End Sub

Public Function getECICS(ByVal casNum As String) As String
  Dim XMLhttp: Set XMLhttp = CreateObject("MSXML2.ServerXMLHTTP")
  XMLhttp.setTimeouts 2000, 2000, 2000, 2000
  XMLhttp.Open "GET", "http://ec.europa.eu/taxation_customs/dds2/ecics/chemicalsubstance_consultation.jsp?Lang=en&Cas=" & casNum & "&Cus=&CnCode=&EcCode=&UnCode=&Name=&LangNm=en&Inchi=&Characteristic=&sortOrder=1&Expand=true&offset=0&range=25", False
  XMLhttp.send
  If XMLhttp.Status = 200 Then
    getECICS = XMLhttp.responseText
  Else
    getECICS = ""
  End If
End Function

Thanks in advance

  • I get exactly the same result whether I use your method or just "view source" in the browser. You should look at the source: there's a bunch of script at the top, before the opening <html> tag. Commented Feb 26, 2014 at 23:36
  • Yes, I know, but I am not interested in that part. The part I am interested in is at the bottom of the page. For this particular search it is "0021314-9", but this number does not appear using my method. Interestingly, if I go to Firefox Inspector > html > copy outer html, the clipboard contains the search results. Commented Feb 27, 2014 at 0:30
  • How are you testing for the presence of that "0021314-9"? If you're using Debug.Print, you should be aware it only displays up to a maximum number of lines. Otherwise, please expand. Commented Feb 27, 2014 at 0:35
  • On second look, it seems the page you're looking at is dynamic: content is added after the page has loaded, so you cannot use an XMLHttp approach to get at the data. You will need to use some kind of browser automation (such as automating IE). Commented Feb 27, 2014 at 0:48
  • Well, initially I had set up a RegEx pattern search on "myString". I was confused because I was not getting any match, so I wrote the content of "myString" into a text file and simply searched for "0021314-9" with Notepad. Still no hits. Commented Feb 27, 2014 at 0:49

2 Answers

Tim has hit the nail on the head. The webpage uses JavaScript to update the page once the HTML has been downloaded. A browser runs that script automatically; ServerXMLHTTP only fetches the raw response and never executes it.

If you run the code below, it will dump the response into an HTML file which you can view in Chrome, IE, Firefox, etc.

Sub test()
    Dim myString As String
    myString = getECICS("103-90-2") ' myString only contains inner html
End Sub

Public Function getECICS(ByVal casNum As String) As String
  Dim XMLhttp: Set XMLhttp = CreateObject("MSXML2.ServerXMLHTTP")
  XMLhttp.setTimeouts 2000, 2000, 2000, 2000
  XMLhttp.Open "GET", "http://ec.europa.eu/taxation_customs/dds2/ecics/chemicalsubstance_consultation.jsp?Lang=en&Cas=" & casNum & "&Cus=&CnCode=&EcCode=&UnCode=&Name=&LangNm=en&Inchi=&Characteristic=&sortOrder=1&Expand=true&offset=0&range=25", False
  XMLhttp.send
  If XMLhttp.Status = 200 Then
    getECICS = XMLhttp.responseText
  Else
    getECICS = ""
  End If
  outputtext (getECICS)
End Function

' Writes the given text to test.html next to the workbook.
Sub outputtext(ByVal text As String)
    Dim MyFile As String, fnum As Integer
    MyFile = ThisWorkbook.Path & "\" & "test.html"
    'get a free file number and open the file for output
    fnum = FreeFile()
    Open MyFile For Output As #fnum
    'use Print when you want the string without quotation marks
    Print #fnum, text
    Close #fnum
End Sub

Unfortunately, the easiest solution is to run your automation in a browser, or another script-enabled environment, to get at the required data.

Many sites now use JavaScript, AJAX, and login sessions to control the speed of, and access to, their resources, so you cannot always get the desired speed increase by avoiding a browser.
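
A minimal sketch of the browser-based route, assuming Internet Explorer is installed and can be automated through CreateObject. The function name getECICSViaIE is made up for illustration, and the wait loop only covers the initial page load, so script-generated content may need extra waiting or polling:

```vba
' Hypothetical helper: loads the page in a hidden IE instance and
' returns the markup of the document after the browser has loaded it.
Public Function getECICSViaIE(ByVal casNum As String) As String
    Dim ie As Object
    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = False
    ie.Navigate "http://ec.europa.eu/taxation_customs/dds2/ecics/chemicalsubstance_consultation.jsp?Lang=en&Cas=" & casNum & "&Cus=&CnCode=&EcCode=&UnCode=&Name=&LangNm=en&Inchi=&Characteristic=&sortOrder=1&Expand=true&offset=0&range=25"

    ' Wait until the initial document has loaded (4 = READYSTATE_COMPLETE).
    Do While ie.Busy Or ie.readyState <> 4
        DoEvents
    Loop

    ' Assumption: the results have been inserted by this point.
    ' If not, an additional wait or a polling loop would be needed here.
    getECICSViaIE = ie.document.documentElement.outerHTML

    ie.Quit
    Set ie = Nothing
End Function
```

For hundreds of search items it would probably be worth creating the IE object once and reusing it across calls rather than starting a new instance per lookup.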

Have a look at the other members of XMLHttpRequest...

responseText returns the response body as text

responseXML returns the body as a DOM object

I think you are after: XMLhttp.response which returns the whole response.

or maybe: XMLhttp.responseBody ?

I'm not totally sure about this because I've only used the C++ interface myself.

see: http://msdn.microsoft.com/en-us/library/windows/apps/hh453379.aspx#methods
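
To check what these properties hold for this page, a small sketch like the one below could be run. The Sub name compareResponseProperties is hypothetical, and it only uses responseText and responseBody. Both carry the same server response (decoded text versus raw bytes), so neither will contain content that is added later by script:

```vba
' Hypothetical check: compare the text and raw-byte views of one response.
Sub compareResponseProperties()
    Dim http As Object
    Set http = CreateObject("MSXML2.ServerXMLHTTP")
    http.Open "GET", "http://ec.europa.eu/taxation_customs/dds2/ecics/chemicalsubstance_consultation.jsp?Lang=en&Cas=103-90-2&Cus=&CnCode=&EcCode=&UnCode=&Name=&LangNm=en&Inchi=&Characteristic=&sortOrder=1&Expand=true&offset=0&range=25", False
    http.send

    ' responseText is the body decoded to a String.
    Debug.Print TypeName(http.responseText), Len(http.responseText)

    ' responseBody is the same body as an array of raw bytes.
    Dim rawBytes() As Byte
    rawBytes = http.responseBody
    Debug.Print TypeName(rawBytes), UBound(rawBytes) - LBound(rawBytes) + 1
End Sub
```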

  • Thanks for your response, but I am afraid using XMLhttp.response didn't change anything. I get the same content in the string.
  • It does, but it returned something odd. I think Tim Williams might have the answer: you cannot use XMLhttp with a dynamic page.
