I've aged five years spending hours and hours trying to solve and understand this, so here goes :)

I am trying to extract some tables from this company page on Market Screener using the CreateObject method.

Taking table(25) as an example (this one) (screenshot), I am trying to extract the table "Type of business", specifically the first column listing the business types (not the 2016, 2017 and Delta columns).

I found a head start online in this 2016 Stack Overflow thread:

    Dim oDom As Object: Set oDom = CreateObject("htmlFile")
    Dim x As Long, y As Long
    Dim oRow As Object, oCell As Object
    Dim vData As Variant
    Dim link As String

    link = "https://www.marketscreener.com/COLUMBIA-SPORTSWEAR-COMPA-8859/company/"

    y = 1: x = 1

    With CreateObject("msxml2.xmlhttp")
        .Open "GET", link, False
        .send
        oDom.body.innerHTML = .responseText
    End With

    With oDom.getElementsByTagName("table")(25)
        ReDim vData(1 To .Rows.Length, 1 To 11) '.Rows(1).Cells.Length)
        For Each oRow In .Rows
            For Each oCell In oRow.Cells
                vData(x, y) = oCell.innerText
                y = y + 1
            Next oCell
            y = 1
            x = x + 1
        Next oRow
    End With

    Sheets(2).Cells(66, 2).Resize(UBound(vData), UBound(vData, 2)).Value = vData

It sort of works, but it returns the whole table jumbled into a single cell, like this.

I then found another tweak online, which suggests copying the table to the clipboard and letting Excel work out how to paste it in. That sort of works too:

    With oDom.getElementsByTagName("table")(25)
        Dim dataObj As Object
        Set dataObj = CreateObject("new:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
        dataObj.SetText "<table>" & .innerHTML & "</table>"
        dataObj.PutInClipboard
    End With

    Sheets(2).Paste Sheets(2).Cells(66, 1)

This creates the result more or less correctly, but not as values only. I am trying to paste special, without any formatting.

It is driving me a bit nuts. I get the concept but am completely stuck at the moment. Is there a way to do it? With a head start I can then replicate it on the other tables on that page and on other tabs.

Any help greatly appreciated,

Best Regards, Paul

2 Answers

Taking your given example you can use a combination of class and type (tag) to select those elements. Same logic applies for next table as well. The problem here is you really have to inspect the html and tailor what you do. Otherwise, the easy solution, which you didn't want, is to use the clipboard.

    Option Explicit

    Public Sub GetTableInfo()
        Dim html As HTMLDocument
        Set html = New HTMLDocument                  '<  VBE > Tools > References > Microsoft HTML Object Library
        With CreateObject("MSXML2.XMLHTTP")
            .Open "GET", "https://www.marketscreener.com/COLUMBIA-SPORTSWEAR-COMPA-8859/company/", False
            .send
            html.body.innerHTML = .responseText
        End With
        Dim leftElements As Object, td As Object
        '.tabElemNoBor.fvtDiv tr:nth-of-type(2) td.nfvtTitleLeft
        Set leftElements = html.getElementsByClassName("tabElemNoBor fvtDiv")(0).getElementsByTagName("tr")(2)
        For Each td In leftElements.getElementsByTagName("td")
            If td.className = "nfvtTitleLeft" Then
                Debug.Print td.innerText
            End If
        Next
    End Sub
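
Since the question asks for plain values on a worksheet rather than output in the Immediate window, here is a minimal sketch of the same selection writing each match into a cell. It assumes the page still has the structure targeted above, and it borrows Sheets(2) and start cell B66 from the question; the procedure name WriteTableInfoToSheet is made up for illustration.

```vba
Option Explicit

' Requires: VBE > Tools > References > Microsoft HTML Object Library
Public Sub WriteTableInfoToSheet()
    Dim html As HTMLDocument
    Dim leftElements As Object, td As Object
    Dim r As Long

    Set html = New HTMLDocument
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.marketscreener.com/COLUMBIA-SPORTSWEAR-COMPA-8859/company/", False
        .send
        html.body.innerHTML = .responseText
    End With

    ' Same selection as in the code above
    Set leftElements = html.getElementsByClassName("tabElemNoBor fvtDiv")(0).getElementsByTagName("tr")(2)

    r = 66   ' assumption: first output row, taken from the question
    For Each td In leftElements.getElementsByTagName("td")
        If td.className = "nfvtTitleLeft" Then
            ' Assigning .Value writes the text only; no HTML formatting comes along
            ThisWorkbook.Sheets(2).Cells(r, 2).Value = td.innerText
            r = r + 1
        End If
    Next td
End Sub
```

Because each cell is assigned through .Value, nothing goes through the clipboard and no paste special step is needed.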

1 Comment

Thank you for your input as well, QHarr. You have both given me a solution that shows how it can be done, and I am super grateful to you both. The other solution, from someone who wrote a macro that he uses but isn't able to share, does it with this: xHttp.Open "GET", "marketscreener.com/COLUMBIA-SPORTSWEAR-COMPA-8859/company, False followed by xHttp.Send. I just noticed that the site returns a cookie pop-up page on which "OK" has to be clicked manually before any method will return the actual data. Is there a way to 'accept' the cookie from the code?

If you have Excel 2010+, you can do this using Power Query. You can set up a query to get this Data from the Web.

The PQ code would be:

    let
        Source = Web.Page(Web.Contents("https://www.marketscreener.com/COLUMBIA-SPORTSWEAR-COMPA-8859/company/")),
        myData = Source{3}[Data],
        firstColumn = {List.First(Table.ColumnNames(myData))},
        #"Removed Other Columns" = Table.SelectColumns(myData,firstColumn),
        #"Removed Blank Rows" = Table.SelectRows(#"Removed Other Columns", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null})))
    in
        #"Removed Blank Rows"

This results in: (screenshot of the query output)

And the query can be refreshed, edited, etc.

As written, the query will keep the first column of the desired table. You can decide which table to process by changing the number in Source{n}. 3 happens to be the one you are interested in, but there are 11 or 12 tables, if I recall correctly.

3 Comments

for a much simpler method +1
Wow, thank you for the fast reply. Is there a way to return the results without the Column1 header, as a one-off block of text, so I can put it into a cell with something like Range("A3").Value = thevalues?
@atom99 There is an option in the GUI to promote the first row to the header row. If you do that, there effectively would not be a real header. Or you could remove it with VBA code.
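
One possible way to do the VBA part is sketched below. It assumes the query is loaded to a worksheet table (a ListObject); the sheet names "Sheet1" and "Sheet2" and the procedure name are hypothetical. It copies only the data rows, so the Column1 header is left behind.

```vba
Option Explicit

Public Sub CopyQueryResultAsValues()
    Dim lo As ListObject

    ' Assumption: the query output is the first table on "Sheet1" (hypothetical name)
    Set lo = ThisWorkbook.Worksheets("Sheet1").ListObjects(1)

    ' DataBodyRange excludes the header row; it is Nothing when the table has no rows
    If lo.DataBodyRange Is Nothing Then Exit Sub

    With lo.DataBodyRange
        ' Write values only, starting at A3 on "Sheet2" (hypothetical target)
        ThisWorkbook.Worksheets("Sheet2").Range("A3") _
            .Resize(.Rows.Count, .Columns.Count).Value = .Value
    End With
End Sub
```

The copy is a one-off snapshot: refreshing the query later does not update the values written at A3 unless the procedure is run again.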
