I have seen this question posted before, but my situation is somewhat different, so I was hoping to get some help from the community and maybe a fresh perspective. I have a macro written in VBA that is supposed to pull return data from this company's online database: returns for the MSCI World Index, the S&P 500, etc. The code works on other pages, but I think this one is different. I talked to the webmaster, who told me that the page was not designed to be scraped, but that scraping is not restricted by their usage policy. Getting the data this way would be a huge time saver for me, so I'm trying hard to find a way to do it. I've tagged this under JavaScript as well, since I think the code would be very similar and I want to be open to as many solutions as possible.
The situation is this: the following code throws an "Object variable not set" error when it reaches the actual scraping of the data (the line that begins 'Set els = htmlDoc...'). I've tried many combinations of the getElement(s) functions, thinking that might be the problem, but I've drawn a blank. Does anybody know another way to set the object variable in this environment, or any other creative way to pull the data?
I can't give out the login info, but I think by just navigating to the 'caRetPage' site, you can see the html code that I'm trying to scrape/parse.
Sub caScrape()
    Dim ie As Object 'ie: internet explorer
    Dim htmlDoc As MSHTML.HTMLDocument
    Dim els As Object 'to store html objects
    Dim rtn As String 'to store values to be scraped from page
    Dim loginButton As Object

    caLoginPage = "https://members.cambridgeassociates.com/Login/Forms/login-form.asp"
    caRetPage = "https://members.cambridgeassociates.com/markets/marketindexsnapshot/DailyMarketReturnsUS.asp"
    caUser = "xxxxx"
    caPass = "xxxxx"

    Set ie = CreateObject("internetexplorer.application")
    ie.Visible = True
    ie.navigate caLoginPage
    While ie.Busy
        DoEvents
    Wend
    Do Until ie.readyState = 4
        DoEvents
    Loop
    Set htmlDoc = ie.document

    'Log in to site
    Set loginButton = htmlDoc.getElementsByTagName("button").Item(0)
    With htmlDoc
        .all("Username").Value = caUser
        .all("Password").Value = caPass
        loginButton.Click
    End With
    While ie.Busy
        DoEvents
    Wend
    Set acceptButton = htmlDoc.getElementsByName("Submit").Item(0)
    acceptButton.Click
    While ie.Busy
        DoEvents
    Wend

    'Here is the page with the return data on it
    ie.navigate caRetPage
    While ie.Busy
        DoEvents
    Wend
    Do Until ie.readyState = 4
        DoEvents
    Loop
    Set htmlDoc = ie.document

    'This next line is where the error gets thrown
    Set els = htmlDoc.getElementById("tblData")(0).getElementByTagName("tr")(5).getElementByTagName("td")(1)
    'Also tried the following and plenty of variations of getElement command
    'Set els = htmlDoc.getElementsByTagName("body")(0).getElementsByTagName("table")(2).getElementsByTagName("tbody")(0).getElementByTagName("tr")(5).getElementByTagName("td")(1)
    rtn = els.innerText
    Debug.Print(rtn)
End Sub
Any help would be greatly appreciated.
getElementById("tblData") will always return a single element (assuming a match was made), not a collection/list, so you don't need the (0) following it. getElementByTagName is not a valid method: it's getElementsByTagName (though you have it correct in the commented-out lines). It's hard to help further without the HTML source or an accessible URL.
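As a rough sketch of those two corrections, the lookup could be split into steps so that a missing table, row, or cell returns an empty string instead of raising the "Object variable not set" error. The helper name GetCellText and its parameters are hypothetical, and since the page's HTML isn't available, the id "tblData" and the row/column indexes are taken from the question as-is and may not match the real markup.

```vba
' Hypothetical helper (not from the original post): returns the text of one
' table cell, or "" if the table, row, or cell cannot be found.
' Assumes a reference to the Microsoft HTML Object Library, as in the question.
Function GetCellText(ByVal doc As MSHTML.HTMLDocument, ByVal tableId As String, _
                     ByVal rowIndex As Long, ByVal colIndex As Long) As String
    Dim tbl As Object, rows As Object, cells As Object

    ' getElementById returns a single element (or Nothing), so no (0) here
    Set tbl = doc.getElementById(tableId)
    If tbl Is Nothing Then Exit Function

    ' getElementsByTagName (plural) returns a zero-based collection
    Set rows = tbl.getElementsByTagName("tr")
    If rowIndex >= rows.Length Then Exit Function

    Set cells = rows(rowIndex).getElementsByTagName("td")
    If colIndex >= cells.Length Then Exit Function

    GetCellText = cells(colIndex).innerText
End Function
```

In caScrape, the failing line and the one after it could then become rtn = GetCellText(htmlDoc, "tblData", 5, 1). If that returns an empty string, stepping through the helper shows which lookup came back empty.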