0

As with many other questions, I'm really new to using Excel macros and HTML.

I'm building a table with entries of items from Dungeons and Dragons (DnD) that I want to get descriptions of from a particular web page: http://www.5esrd.com/gamemastering/magic-items/ . For example, An entry in my table would be Adamantine Armor, so I would want the macro to search the web page for that listing, scrape the description that's entered on the webpage and save that to the excel document. In this case, that specific description (as per the 5esrd webpage) is as follows:

Armor (medium or heavy, but not hide), uncommon

This suit of armor is reinforced with adamantine, one of the hardest substances in existence. While you’re wearing it, any critical hit against you becomes a normal hit.

On this page, all the items are links to pages that have them grouped by type (Sword, wand, etc.). I would like to iterate through my row of items in Excel, search for that item on the page and scrape its description into my excel table.

I was following along this page: https://www.wiseowl.co.uk/blog/s393/scrape-website-html.htm, which scrapes data from stackoverflow's home page. It directly targets some of the tables that the web page uses to organize its entries. The DnD page I'm trying to use doesn't organize its entries into tables like this, so I'm a little lost with how to proceed.

If anyone could help point me in the right direction, I would be very grateful!

1
  • Thanks for the advice Shahin! I've added a few more details of what exactly i'm looking for, I hope that helps explain what I'm looking to do. Commented Oct 18, 2017 at 18:28

1 Answer 1

1

This is the code I use for scraping a web page on our company site. I put all my data in column B, so make adjustments accordingly. This should get you started.

Sub TestScrape()
'   SCRAPE A WEB PAGE
Dim ieDoc As New HTMLDocument 'ieDocDocument
Dim tdCollection As Object 'table that has the javascript attributes and contains the element I want to click
Dim AnchorLinks As Object
Dim tdElements As Object
Dim tdElement As Object
Dim AnchorLink As Object
Dim lRow As Long
Dim ie As InternetExplorer
Dim cls As IHTMLElementCollection

Set ie = New SHDocVw.InternetExplorer
 With ie
   .Visible = True
   .Left = 0
   .TheaterMode = True   '<<-- Comment this out if you don't want Theater Mode
   .Navigate url:="Enter your URL here"

   While .ReadyState < 4 Or .Busy:  DoEvents:  Wend

 End With

        Application.Wait Now + TimeSerial(0, 0, 4)


ieDoc.body.innerHTML = ie.Document.body.innerHTML

With ieDoc.body
    Set AnchorLinks = .getElementsByTagName("table")
    Set tdElements = .getElementsByTagName("td") '

    For Each AnchorLink In AnchorLinks
        Debug.Print AnchorLink.innerText
    Next AnchorLink

End With

lRow = 1
   For Each tdElement In tdElements
      Debug.Print tdElement.innerText
      Cells(lRow, 2).Value = tdElement.innerText
      lRow = lRow + 1
   Next
     Application.Wait (Now + TimeValue("0:00:1"))
 With ie
  .TheaterMode = False
  .Quit
 End With
Set ie = Nothing

End Sub
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks, I'm going to give this a swing and see how I can play with it to make it fit what I need.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.