I am trying to use MSXML2 and IHTMLDocument to handle the iframe part of an HTML web page.

I want to fetch the page with MSXML2 and work on the saved response, on the assumption that this is faster than driving InternetExplorer or Selenium from VBA. (I want to avoid using IE or Selenium as much as possible.)

But I couldn't work out how to hold the document as XML (to take advantage of its speed) and at the same time click an element in the document without the help of a browser (IE or Selenium). And even after clicking a tab (id="cns_Tab21") on this web page, I have difficulty retrieving the data.

So my questions are:

1> Is it possible to minimize the use of a browser for clicking?

2> Even after clicking (using Selenium), the code throws an XPath-related error in the VBA editor.

Thank you in advance for your answers. The URL used for this is http://bitly.kr/finance and the iframe inside the link is http://bitly.kr/LT0aCb

    'I declared the objects
    Dim XMLReq As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim iframeDoc As IHTMLDocument

    'and loaded the response text into the HTML document
    HTMLDoc.body.innerHTML = XMLReq.responseText

    'and tried to assign the iframe to iframeDoc...
    Set iframeDoc = HTMLDoc.getElementById("coinfo_cp")
    'I tried .contentDocument, but HTMLDoc doesn't seem to have this property.

I don't know how to access the information I saved to iframeDoc above.

    'And after I use Selenium I can't figure out why it throws an error
    For Each ele In selenium.FindElementsByTag("th")
        If ele.Attribute("innerText") = "CAPEX" Then
            Debug.Print ele.FindElementsByXPath("./../td").Attribute("innerText")
        End If
    Next

This post isn't a duplicate, since I am trying to handle the iframe element with XML and without the InternetExplorer reference (ie.document) in Excel VBA.

  • What is the url please? Commented Apr 12, 2019 at 10:27
  • Possible duplicate of Accessing object in iframe using VBA Commented Apr 12, 2019 at 11:44
  • @QHarr thank you. bitly.kr/finance it's not an English page but there is an iframe tag in the middle Commented Apr 13, 2019 at 14:39
  • There are quite a few iframes. Which do you want? What data are you after? Commented Apr 13, 2019 at 14:48
  • @QHarr sorry, I gave you the wrong URL; the page URL is bitly.kr/RWz5x and the iframe id is "coinfo_cp" Commented Apr 15, 2019 at 15:28

1 Answer

You can replicate the XHR request the page makes when that tab (not the iframe) is selected. I use the clipboard to copy the table to Excel. Note: the URL I am using is from our discussion in the comments; this information should be reflected in the question.

Option Explicit
Public Sub GetTable()
'VBE > Tools > References > Microsoft HTML Object Library
    Dim html As HTMLDocument, hTable As HTMLTable, clipboard As Object
    Set html = New HTMLDocument

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://navercomp.wisereport.co.kr/v2/company/ajax/cF1001.aspx?cmp_cd=005930&fin_typ=0&freq_typ=Y&encparam=ZXR1cWFjeGJnS1lWOHhCYmNScmJXUT09&id=bG05RlB6cn", False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        html.body.innerHTML = .responseText
    End With

    Set hTable = html.querySelector(".hbG05RlB6cn + .gHead01")
    Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}") ' New DataObject
    clipboard.SetText hTable.outerHTML
    clipboard.PutInClipboard
    ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).PasteSpecial
End Sub
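
As an alternative to the clipboard step above, the table can also be written to the sheet cell by cell. The following is only a minimal sketch: `WriteTableToSheet` is a hypothetical helper name that does not appear in the original answer, and it assumes a simple table without merged cells (any `colspan` or `rowspan` in the header would shift columns).

```vba
Option Explicit

' Sketch: copy an HTML table into a worksheet without using the clipboard.
' Requires VBE > Tools > References > Microsoft HTML Object Library.
' WriteTableToSheet is a hypothetical helper, not part of the original answer.
Public Sub WriteTableToSheet(ByVal hTable As HTMLTable, ByVal ws As Worksheet)
    Dim tr As HTMLTableRow
    Dim cell As Object          ' may be a <th> or a <td> element
    Dim r As Long, c As Long

    For Each tr In hTable.Rows
        r = r + 1               ' next worksheet row
        c = 0
        For Each cell In tr.Cells
            c = c + 1           ' next worksheet column
            ws.Cells(r, c).Value = Trim$(cell.innerText)
        Next cell
    Next tr
End Sub
```

It could be called as `WriteTableToSheet hTable, ThisWorkbook.Worksheets("Sheet1")` in place of the three clipboard lines. The trade-off is that cell formatting from the HTML is not carried over, whereas pasting from the clipboard keeps it.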

You can find the parameters of the AJAX URL for the tab content update in the scripts of the page, along with the target element for the update.

An updated version that keeps only the CAPEX row (it still needs tidying up):

Option Explicit
Public Sub GetTable()
'https://navercomp.wisereport.co.kr/v2/company/c1010001.aspx?cmp_cd=005930
'VBE > Tools > References > Microsoft HTML Object Library
    Dim html As HTMLDocument, hTable As HTMLTable, clipboard As Object, ws As Worksheet
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Set html = New HTMLDocument

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://navercomp.wisereport.co.kr/v2/company/ajax/cF1001.aspx?cmp_cd=005930&fin_typ=0&freq_typ=Y&encparam=ZXR1cWFjeGJnS1lWOHhCYmNScmJXUT09&id=bG05RlB6cn", False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send
        html.body.innerHTML = .responseText

    End With

    Set hTable = html.querySelector(".hbG05RlB6cn + .gHead01") '2nd tab. CAPEX row

    Dim html2 As HTMLDocument, i As Long

    Set html2 = New HTMLDocument
    html2.body.innerHTML = hTable.outerHTML

    Dim tableBodyRows As Object, tableBodyRowLength As Long, tableHeaderRowLength As Long, tableHeaderRows As Object, targetRow As Long

    Set tableBodyRows = html2.querySelectorAll("tbody tr .bg")
    tableBodyRowLength = tableBodyRows.Length
    tableHeaderRowLength = html2.querySelectorAll("thead tr").Length + 2

    For i = 0 To tableBodyRowLength - 1
        If Trim$(tableBodyRows.item(i).innerText) = "CAPEX" Then
            targetRow = i + tableHeaderRowLength + 1
            Exit For
        End If
    Next

    Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}") ' New DataObject
    clipboard.SetText hTable.outerHTML
    clipboard.PutInClipboard
    ws.Cells(1, 1).PasteSpecial

    Dim unionRng As Range

    For i = (tableHeaderRowLength + 1) To (tableBodyRowLength + tableHeaderRowLength)
        If i <> targetRow Then
            If Not unionRng Is Nothing Then
                Set unionRng = Union(ws.rows(i), unionRng)
            Else
                Set unionRng = ws.rows(i)
            End If
        End If
    Next
    If Not unionRng Is Nothing Then unionRng.Delete
End Sub
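
On the second part of the question (the XPath error after clicking with Selenium), one likely cause, offered here as an assumption rather than something confirmed in this thread, is that `FindElementsByXPath` returns a collection of elements, so `.Attribute` cannot be called on it directly. A hedged sketch, assuming the SeleniumBasic library with `ChromeDriver`, and assuming the iframe's own URL (the one in the code comment above) is loaded directly so that no frame switching is needed; `PrintCapexRow` is a hypothetical name and the one-second wait is an arbitrary guess:

```vba
Option Explicit

' Sketch only. Requires VBE > Tools > References > Selenium Type Library (SeleniumBasic).
' Assumes the tab with id "cns_Tab21" must be clicked before the CAPEX row exists.
Public Sub PrintCapexRow()
    Dim driver As New ChromeDriver
    Dim th As WebElement, td As WebElement

    ' Load the iframe's source page directly instead of the outer page
    driver.Get "https://navercomp.wisereport.co.kr/v2/company/c1010001.aspx?cmp_cd=005930"
    driver.FindElementById("cns_Tab21").Click
    driver.Wait 1000            ' crude pause for the tab content to refresh (assumed sufficient)

    For Each th In driver.FindElementsByTag("th")
        If Trim$(th.Attribute("innerText")) = "CAPEX" Then
            ' FindElementsByXPath returns a collection, so read each <td> in turn
            For Each td In th.FindElementsByXPath("./../td")
                Debug.Print td.Attribute("innerText")
            Next td
            Exit For
        End If
    Next th

    driver.Quit
End Sub
```

If only the first value is needed, the singular `FindElementByXPath` returns a single element on which `.Attribute` can be called directly.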
17 Comments

Hi, I do not see that id in the HTML for that link. Where is it? Please update your question to say exactly what you want.
I have updated the answer to handle the tab situation. Please try it.
Has the above answered the question?
I think my answer needs refining. I will continue to look at it today.
I got the URL from the Network tab of Chrome's dev tools when clicking on the tab in question; the XHR appears in the network list. I wrote the CSS selector passed to querySelector myself by looking at the HTML (see developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors). The GetObject call is the reference for the clipboard object; you can google it.