3

I am trying to create a macro that scrapes a cargo tracking website. But I have to create 4 such macros as each airline has a different website.

I am new to VBA and web scraping.

I have put together a code that works for 1 website. But when I tried to replicate it for another one, I am stuck in the loop. I think it maybe how I am referring to the element, but like I said, I am new to VBA and have no clue about HTML.

I am trying to get the "notified" value in the highlighted line from the image.

IMAGE:"notified" text to be extracted Below is the code I have written so far that gets stuck in the loop. Any help with this would be appreciated.

Sub FlightStat_AF()

Dim url As String
Dim ie As Object
Dim nodeTable As Object

  'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
  url = "https://www.afklcargo.com/mycargo/shipment/detail/057-92366691"

  'Initialize Internet Explorer, set visibility,
  'call URL and wait until page is fully loaded
  Set ie = CreateObject("InternetExplorer.Application")
  ie.Visible = False
  ie.navigate url
  Do Until ie.readyState = 4: DoEvents: Loop
  
  'Wait to load dynamic content after IE reports it's ready
  'We can do that in a loop to match the point the information is available
  Do
    On Error Resume Next
    Set nodeTable = ie.document.getElementByClassName("block-whisper")
    On Error GoTo 0
  Loop Until Not nodeTable Is Nothing
  
  'Get the status from the table
  MsgBox Trim(nodeTable.getElementsByClassName("fs-12 body-font-bold").innerText)
  
  'Clean up
  ie.Quit
  Set ie = Nothing
  Set nodeTable = Nothing
End Sub
8
  • You are likely stuck in a loop because "block-whisper" is never found. When that is the case, your code will loop forever. The loop isn't really needed. You either find the element or you don't and you can take action either way. Commented Sep 4, 2020 at 14:06
  • @BrianMStafford, The loop is needed because it's a dynamic table that is loaded after the source page is loaded. On the element part, I am not sure if I am referring to the correct element as I don't know anything about HTML. Can you please look at the image attached and see what mistake I am making? Commented Sep 4, 2020 at 14:37
  • 1
    You try my code from this answer for another page: stackoverflow.com/questions/63738093/… That can't work. The code for scraping is different for every page because every page is different. I have now no time to look how it works for this page. Maybe later. Commented Sep 4, 2020 at 14:42
  • Try this link. You can use xhr to fetch what you want. Commented Sep 4, 2020 at 15:05
  • @SIM, Thank you for this. But I am absolutely new to web scraping and only have basic VBA knowledge. Would you be able help me with the code a little as well? Commented Sep 4, 2020 at 16:08

1 Answer 1

2

Some basics:
For simple accesses, like the present ones, you can use the get methods of the DOM (Document Object Model). But there is an important difference between getElementByID() and getElementsByClassName() / getElementsByTagName().

getElementByID() searches for the unique ID of a html tag. This is written as the ID attribute to html tags. If the html standard is kept by the page, there is only one element with this unique ID. That's the reason why the method begins with getElement.

If the ID is not found when using the method, VBA throws a runtime error. Therefore the call is encapsulated in the loop from the other answer from me, into switching off and on again the error handling. But in the page from this question there is no ID for the html area in question.

Instead, the required element can be accessed directly. You tried the access with getElementsByClassName(). That's right. But here comes the difference to getElementByID().

getElementsByClassName() and getElementsByTagName() begin with getElements. Thats plural because there can be as many elements with the same class or tag name as you want. This both methods create a html node collection. All html elements with the asked class or tag name will be listet in those collections.

All elements have an index, just like an array. The indexes start at 0. To access a particular element, the desired index must be specified. The two class names fs-12 body-font-bold (class names are seperated by spaces, you can also build a node collection by using only one class name) deliver 2 html elements to the node collection. You want the second one so you must use the index 1.

This is the VBA code for the asked page by using the IE:

Sub FlightStat_AF()

Dim url As String
Dim ie As Object

  'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
  url = "https://www.afklcargo.com/mycargo/shipment/detail/057-92366691"

  'Initialize Internet Explorer, set visibility,
  'call URL and wait until page is fully loaded
  Set ie = CreateObject("InternetExplorer.Application")
  ie.Visible = False
  ie.navigate url
  Do Until ie.readyState = 4: DoEvents: Loop
  
  'Wait to load dynamic content after IE reports it's ready
  'We do that with a fix manual break of a few seconds
  'because the whole page will be "reload"
  'The last three values are hours, minutes, seconds
  Application.Wait (Now + TimeSerial(0, 0, 3))
  
  'Get the status from the table
  MsgBox Trim(ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innerText)
  
  'Clean up
  ie.Quit
  Set ie = Nothing
End Sub

Edit: Sub as function

This sub to test the function:

Sub testFunction()
  Dim flightStatAfResult As String
  flightStatAfResult = FlightStat_AF("057-92366691")
  MsgBox flightStatAfResult
End Sub

This is the sub as function:

Function FlightStat_AF(cargoNo As String) As String

Dim url As String
Dim ie As Object
Dim result As String

  'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
  url = "https://www.afklcargo.com/mycargo/shipment/detail/" & cargoNo

  'Initialize Internet Explorer, set visibility,
  'call URL and wait until page is fully loaded
  Set ie = CreateObject("InternetExplorer.Application")
  ie.Visible = False
  ie.navigate url
  Do Until ie.readyState = 4: DoEvents: Loop
  
  'Wait to load dynamic content after IE reports it's ready
  'We do that with a fix manual break of a few seconds
  'because the whole page will be "reload"
  'The last three values are hours, minutes, seconds
  Application.Wait (Now + TimeSerial(0, 0, 3))
  
  'Get the status from the table
  result = Trim(ie.document.getElementsByClassName("fs-12 body-font-bold")(1).innerText)
  
  'Clean up
  ie.Quit
  Set ie = Nothing
  
  'Return value of the function
  FlightStat_AF = result
End Function
Sign up to request clarification or add additional context in comments.

6 Comments

Thank You for this code! I am writing a similar code for another cargo tracking website. I have gotten as far as navigating to the page. But the problem is that the data table doesn't load at all in IE. It loads perfectly on chrome. Would you know the reason behind it? This is the site I am trying to scrape from - trackingmore.com/aircargo/72403698656
@AchalAesai The technology for loading the dynamic content of the page is too modern for IE. IE has not been developed further for some time and modern JavaScript can no longer be executed. I suppose you can only scrape the desired data if you use an interface like Selenium for it. With Selenium, browsers other than IE can be controlled from VBA. selenium.dev Another possibility is to use another language like Python, with appropriate libraries, like Beautifull Soap.
Thanks for clarifying this. I have no knowledge of python. But I will look into selenium as per your advice.
I am trying create the above code as a function. When I run it as a sub, it gives me the correct result. But when I run it as a function, it gives me an #Value error. Where am I going wrong? Below is the bode I have written to convert the sub to a function.
Thank you for the code. I tried it with the test sub and the msgbox gives the correct answer. but When I run it as a function =FlightStat_AF(H3) it again gives me a #value error.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.