1

first time posting here so hoping to get some good feedback!

I am trying to automate the process of retrieving data from the following website: https://hazards.atcouncil.org/#/seismic?lat=38.461982&lng=-122.425394&address=

The parameters that are returned are based on the Reference Document, Risk Category, and Site Class drop-down values. Using Excel VBA, I have been able to navigate to the web page, select the desired items from the drop-down, and read the parameters that are in the output. However, when I make my selection, the web page does not update the output values (the parameters), so really I'm just reading the default parameters. It seems to be waiting for me to say "GO" or some kind of 'event', but I don't know enough about HTML-oriented VBA to figure out what to tell it. I've searched around a lot before coming here to post and I've found similar issues that other people have had, but it appears the structure of the HTML code for the drop-downs is just different on this site from what I've seen elsewhere. I'm using Internet Explorer 11.

I'm really hoping this is a simple fix. Thanks in advance for any help!

Here's my code (Excel 2016):

Sub ScrapeData()
Dim objIE As Object
Dim Latitude As Double
Dim Longitude As Double
Dim newHour As Variant
Dim newMinute As Variant
Dim newSecond As Variant
Dim waitTime As Variant
Dim valArray() As String
Dim btnSelect As MSHTML.HTMLSelectElement
Dim btnOption As MSHTML.HTMLOptionElement, ElementCol As MSHTML.IHTMLElementCollection
Dim ElementCol1 As MSHTML.IHTMLElementCollection

'Define the latitude and longitude
Latitude = 38.221565
Longitude = -122.46558

'Create the Internet Explorer object
Set objIE = CreateObject("InternetExplorer.Application")
objIE.Top = 0
objIE.Left = 0
objIE.Visible = True

'This will navigate to the website given the latitude and longitude
objIE.navigate ("https://hazards.atcouncil.org/#/seismic?lat=" & Latitude & "&lng=" & Longitude & "&address=")

'wait here while the browser is busy
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

'This is a designated wait time to allow it to finish loading because sometimes it's not ready
newHour = Hour(Now())
newMinute = Minute(Now())
newSecond = Second(Now()) + 3
waitTime = TimeSerial(newHour, newMinute, newSecond)
Application.Wait waitTime

'Bring the web page to the front
objIE.Visible = True

'Select Design Code Requirements
'Gather all the elements under tag name "option"

Set ElementCol = objIE.document.getElementById("seismic-selector").getElementsByTagName("option")
'Look at the value of each element in ElementCol
For Each btnSelect In ElementCol
    '******************************************************************************
    'This is where I'm having the issue!
    '******************************************************************************
    'If the value is equal to what I'm looking for, then...
    If btnSelect.innerText = "ASCE7-10" Then
        'I need to select this value, but I also need to trigger the web page here
        'I need to tell it "This is what I want, start retrieving information"
        'Instead, it selects the value from the drop down, but it appears to be waiting
        'for me to tell it to "Go"
        'The .Focus and .FireEvent don't appear to do anything
        btnSelect.Focus
        btnSelect.Selected = True
        btnSelect.FireEvent ("onchange")

        'Wait for the web page to update
        newHour = Hour(Now())
        newMinute = Minute(Now())
        newSecond = Second(Now()) + 3
        waitTime = TimeSerial(newHour, newMinute, newSecond)
        Application.Wait waitTime
    'I also need to select these other items from their drop downs
    ElseIf btnSelect.innerText = "IV" Then
        btnSelect.Selected = True
    ElseIf btnSelect.innerText = "D - Stiff Soil" Then
        btnSelect.Selected = True
    End If
Next btnSelect

Dim divElm3 As MSHTML.HTMLDivElement
Dim ElementCol3 As MSHTML.IHTMLElementCollection

Set ElementCol3 = objIE.document.getElementsByClassName("table-row")
i = 1
        For Each divElm3 In ElementCol3
            'The values have return carriages in them, this splits it up by the return carriage (vbLf)
            valArray() = Split(divElm3.innerText, vbLf)
            For j = 1 To (UBound(valArray()) + 1)
                'This puts the values into the worksheet on the "Test" page
                Worksheets("Test").Cells(i, j).Value = Application.Clean(Trim(valArray(j - 1)))
            Next j
            'i will be equal to the number of data values on the web page
            i = i + 1
            'For some reason this pulls everything twice, so I limit it to 20. If you comment
            'this If statement out, you will see what I mean.
            If i > 20 Then
                GoTo EndSub
            End If
        Next divElm3

EndSub:
End Sub

Here's the relevant portion of the HTML code:

<div id="seismic-selector">
<div class="form-group">
<span class="label">Reference Document</span>
<select>
<option value="asce7-16">ASCE7-16</option>
<option value="asce7-10">ASCE7-10</option>
<option value="asce7-05">ASCE7-05</option>
<option value="asce41-17">ASCE41-17</option>
<option value="asce41-13">ASCE41-13</option>
<option value="nehrp-2015">NEHRP-2015</option>
<option value="nehrp-2009">NEHRP-2009</option>
<option value="ibc-2015">IBC-2015</option>
<option value="ibc-2012">IBC-2012</option>
</select>
</div>
<div class="form-group">
<span class="label">Risk Category</span>
<select>
<option value="I">I</option>
<option value="II">II</option>
<option value="III">III</option>
<option value="IV">IV</option>
</select>
</div>
<div class="form-group">
<span class="label">Site Class</span>
<select>
<option value="A">A - Hard Rock</option>
<option value="B">B - Rock</option>
<option value="C">C - Very Dense Soil and Soft Rock</option>
<option value="D">D - Stiff Soil</option>
<option value="E">E - Soft Clay Soil</option>
<option value="F">F - Site Response Analysis</option>
</select>
</div>
<div class="form-group">
<span class="label">Report Title</span>
<input type="text" value="" placeholder="Enter a title..."></div></div>
2
  • Shot in the dark, but after making your selection trying throwing in a DoEvents after that line. Commented Aug 9, 2018 at 21:18
  • I just tried it, and no luck =/ Thank you though for looking into it. Commented Aug 9, 2018 at 21:37

1 Answer 1

1

Selenium:

Here is a version using selenium basic as the page responds to an automated browser selecting items. I have gone with the lat and long directly in the URL for this principles example. It is to show you the basic how. It is easy enough to concatenate these values in during a loop if required.

It is a somewhat odd page and was interesting in terms of writing out the tables.

After downloading selenium you need to go to VBE > Tools > References and add a reference to Selenium Type Library. Some other browsers are supported including IE and FireFox.

Apologies for the size of the images - I tried to make them smaller by adding suffix s |m on the link end but s was too small.

Option Explicit
Public Sub GetInfo()
    Dim d As WebDriver
    Set d = New ChromeDriver
    Const url = "https://hazards.atcouncil.org/#/seismic?lat=38.221565&lng=-122.46558&address="
    Application.ScreenUpdating = False
    With d
        .AddArgument "--headless"
        .Start "Chrome"
        .get url

        With .FindElementsByCss("#seismic-selector select")
            .item(1).AsSelect.SelectByText "ASCE7-10"
            .item(2).AsSelect.SelectByText "II"
            .item(3).AsSelect.SelectByText "D - Stiff Soil"
        End With

        Dim tables As WebElements
        Do
            Set tables = .FindElementsByClass("table", timeout:=7000)
        Loop While tables.Count = 0

        Dim table As Object, tr As Object, td As Object, r As Long, c As Long
        Dim ws As Worksheet, headers()
        headers = Array("Name", "Value", "Description")
        Set ws = ThisWorkbook.Worksheets("Sheet1")
        With ws
            For Each table In tables
                If Not table.Text = vbNullString Then
                    r = GetLastRow(ws, 1) + 2
                    .Cells(r, 1).Resize(1, UBound(headers) + 1) = headers

                    For Each tr In table.FindElementsByClass("table-row")
                        r = r + 1: c = 0
                        For Each td In tr.FindElementsByTag("div")
                            c = c + 1
                            .Cells(r, c) = td.Text
                        Next
                    Next
                End If
            Next
        End With
        .Quit
        Application.ScreenUpdating = True
    End With
End Sub

Public Function GetLastRow(ByVal ws As Worksheet, Optional ByVal columnNumber As Long = 1) As Long
    With ws
        GetLastRow = .Cells(.Rows.Count, columnNumber).End(xlUp).Row
    End With
End Function

Sample from webpage:

sample


Sample worksheet write-out:

sheet


Internet Explorer (less ideal):

Option Explicit
Public Sub ScrapeData()
    Dim objIE As Object, Latitude As Double, Longitude As Double, dropDowns As Object

    Latitude = 38.221565: Longitude = -122.46558
    Set objIE = CreateObject("InternetExplorer.Application")
    Application.ScreenUpdating = True

    With objIE
        '        .Top = 0
        '        .Left = 0
        .Visible = True
        .navigate ("https://hazards.atcouncil.org/#/seismic?lat=" & Latitude & "&lng=" & Longitude & "&address=")

        Do While .Busy = True Or .readyState <> 4: DoEvents: Loop
        Set dropDowns = .document.querySelectorAll("#seismic-selector select")

        With dropDowns
            .item(0).Focus
            SendKeys "{down}"
            .item(1).Focus
            SendKeys "{down}"
            .item(2).Focus
            SendKeys "{down 3}"
        End With

        Dim tables As Object, table As Object, tr As Object, td As Object, r As Long, c As Long, ws As Worksheet, headers()
        headers = Array("Name", "Value", "Description")
        Set ws = ThisWorkbook.Worksheets("Sheet1")
        Do
            DoEvents
            Set tables = .document.getElementsByClassName("table")
        Loop While tables.Length = 0
        With ws
            For Each table In tables
                If Not table.innerText = vbNullString Then
                    r = GetLastRow(ws, 1) + 2
                    .Cells(r, 1).Resize(1, UBound(headers) + 1) = headers

                    For Each tr In table.getElementsByClassName("table-row")
                        r = r + 1: c = 0
                        For Each td In tr.getElementsByTagName("div")
                            c = c + 1
                            .Cells(r, c) = td.innerText
                        Next
                    Next
                End If
            Next
        End With
        .Quit
    End With
    Application.ScreenUpdating = True
End Sub
Public Function GetLastRow(ByVal ws As Worksheet, Optional ByVal columnNumber As Long = 1) As Long
    With ws
        GetLastRow = .Cells(.Rows.Count, columnNumber).End(xlUp).Row
    End With
End Function
Sign up to request clarification or add additional context in comments.

12 Comments

Good approach as always.
Looks like a beautiful solution, thank you so much for putting forth the effort in assisting me. However, my work IT department doesn't feel comfortable allowing me to install Selenium... Would you happen to have any other ideas which could be done without needing to install any other software? Again, thank you very much for your time.
It could just be that I requested permission getting it from github (since that was the link in the response). They said when they downloaded it their antivirus immediately quarantined it. Is there a more (should I say trusted?) source for getting selenium basic?
No that is the official and likely safest source. Why was it quarantined? It may require admin permissions to install as it makes changes to the computer. It is an add-in wrapper so links in with your vba applications. It is very useful for the tougher scrapes and allows you to use a wider range of browsers. Can they not check the quarantine report and if required raise an issue on the GitHub site? The author is pretty amazing and responsive AFAIK.
The reason given from the quarantine pop-up was "SeleniumBasic-master.zip has been detected as Xls.Dropper.Generic::other.talos. Quarantine was successful." and that is all I know in that regard. Do you know if the entire download is necessary? It seems like there are some "extras" in there that may not be necessary for the full functionality, and perhaps something in there is causing it to be quarantined? Thank you again for your help!
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.