I am using the MSXML2.XMLHTTP method for data extraction but am unable to extract data from a specific page

I am currently using the following code to extract data from different pages. It works fine on other pages but does not work properly on one specific page. I want to extract values such as the price and seller name from the sample page.

Dim http As Object, html As New MSHTML.HTMLDocument, topics As Object, titleElem As Object, detailsElem As Object, topic As HTMLHtmlElement
Dim j As Long
Dim RowCount As String
Dim maxid As Long
Dim productdesc1 As String
Dim features As String
Dim news As String
Dim comb As String
t122 = Now
Rin = DMin("[id]", "url", "[Flag] = False")
If Not IsNull(Rin) Then
    Set http = CreateObject("MSXML2.XMLHTTP")
    'http = http.SetOption(2, 13056)
    ' ignore all SSL cert issues
    RowCount = DMin("[id]", "url", "[Flag] = False")
    maxid = DMax("[id]", "url", "[Flag] = False")
    'MsgBox (RowCount)
    Do While RowCount <> ""
        'RowCount = DMin("[id]", "url", "[Flag] = False")
        url = DLookup("[url]", "url", "ID = " & ([RowCount]))
        url = Trim(url)
        t31 = ""
        t31 = (DateDiff("n", t122, Now))
        On Error Resume Next
        http.Open "GET", url, False
        http.Send
        html.body.innerHTML = http.ResponseText
        brand = html.body.innerText
        Set my_data1 = html.getElementsByClassName("a-row a-spacing-mini   olpOffer")
        i = 1
        For Each Item In my_data1
            pr1 = Item.getElementsByClassName("a-size-large a-color-price olpOfferPrice a-text-bold")
            pr2 = pr1.innerText
            dlmsg = Item.innerHTML
            If dlmsg Like "*olpShippingPrice*" Then
                dpr = Item.getElementsByClassName("olpShippingPrice")
                dpr2 = dpr.innerText
            End If

Using the above code, the data should be retrievable from the following page: https://www.amazon.co.uk/gp/offer-listing/B00551P0Q8

  • Is this being done in Access? Also, what exactly are you trying to retrieve from site? And do we need any test values to use? Commented Aug 1, 2019 at 13:31
  • I updated my question and code with the sample values required. Please check. I am doing this in MS Access VBA. Commented Aug 1, 2019 at 15:29
  • I see offer price £49.43 and seller name Hanes. Is that what you expected to retrieve? Commented Aug 1, 2019 at 15:58
  • Yes, but this method is not working on the above page. Commented Aug 1, 2019 at 16:00
  • This works for me: pastebin.com/MXpym4R0 Commented Aug 1, 2019 at 16:11

1 Answer

The following prints every offer price and seller name to the Immediate window. You can decide where to write the values.

Option Explicit

' Requires a reference to the Microsoft HTML Object Library (for HTMLDocument).
Public Sub Test()

    Dim prices As Object, sellers As Object, html As HTMLDocument, i As Long

    Set html = New HTMLDocument
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.amazon.co.uk/gp/offer-listing/B01GK4YHMQ", False
        .Send
        html.body.innerHTML = .ResponseText
    End With
    ' Select on a single class each instead of the full multi-valued class string
    Set prices = html.querySelectorAll(".olpOfferPrice")
    Set sellers = html.querySelectorAll(".olpSellerName a")

    ' Pair price i with seller i
    For i = 0 To prices.Length - 1
        Debug.Print Trim$(prices.Item(i).innerText)
        Debug.Print Trim$(sellers.Item(i).innerText)
    Next
End Sub
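
If some offers have no text link for the seller (an assumption about the page, for example a seller shown only as a logo image), the two collections can differ in length and pairing by index no longer lines up. A minimal sketch of one alternative reads the price and seller from within each offer row. It assumes every offer row carries the class olpOffer, which is taken from the class string in the question. PrintOffersByRow and rowDoc are hypothetical names.

```vba
Option Explicit

' Requires a reference to the Microsoft HTML Object Library (for HTMLDocument).
Public Sub PrintOffersByRow()

    Dim html As HTMLDocument, rowDoc As HTMLDocument
    Dim offers As Object, priceNode As Object, sellerNode As Object
    Dim i As Long

    Set html = New HTMLDocument
    Set rowDoc = New HTMLDocument

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.amazon.co.uk/gp/offer-listing/B01GK4YHMQ", False
        .Send
        html.body.innerHTML = .ResponseText
    End With

    ' Assumption: each offer row has the class "olpOffer"
    Set offers = html.querySelectorAll(".olpOffer")

    For i = 0 To offers.Length - 1
        ' Load one offer row into a scratch document so the selectors
        ' below only see the nodes belonging to that row
        rowDoc.body.innerHTML = offers.Item(i).outerHTML

        Set priceNode = rowDoc.querySelector(".olpOfferPrice")
        Set sellerNode = rowDoc.querySelector(".olpSellerName a")

        ' querySelector returns Nothing when the row has no match
        If Not priceNode Is Nothing Then Debug.Print Trim$(priceNode.innerText)
        If Not sellerNode Is Nothing Then Debug.Print Trim$(sellerNode.innerText)
    Next i
End Sub
```

Reading both values from the same row keeps each price with its own seller, at the cost of re-parsing the HTML of every row.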
5 Comments

The above code is working, but there is one issue: with a large number of URLs (more than 100k), processing is very slow. Is it possible to make the processing much faster?
I would suggest you take the working code and consider posting on the Code Review site. Be sure to read their guidance on posting first. Have a quick look at some of the existing questions on that site that have received positive votes to get a feel for the difference from Stack Overflow.
OK, thanks. Is it possible to use the getElementsByClassName method with the above code? The class name method is not working for me in the above code.
What do you mean? getElementsByClassName is the correct name for the alternative. It requires the entire class name. As the class name is multi-valued (there are multiple classes), using the full multi-valued name is more fragile. Also, querySelector is likely faster.
One last question: when I run the above code, after a few URLs the following message appears in the HTML response: "We are sorry, an error occurred when we tried to process your request." How can I resolve this error? Thanks.
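
The thread does not say what causes the error page mentioned in the last comment. One possibility, and this is only an assumption, is that the site stops serving normal pages when requests arrive too quickly. A minimal sketch under that assumption checks the HTTP status before parsing and pauses between requests. FetchPage and Pause are hypothetical helpers, not part of the answer.

```vba
Option Explicit

' Hypothetical helper: waits roughly the given number of seconds
' while keeping the application responsive.
Private Sub Pause(ByVal seconds As Single)
    Dim startTime As Single
    startTime = Timer
    Do While Timer - startTime < seconds
        If Timer < startTime Then Exit Do   ' Timer resets at midnight
        DoEvents
    Loop
End Sub

' Hypothetical helper: returns True and fills body only when the
' server answers with HTTP status 200.
Public Function FetchPage(ByVal url As String, ByRef body As String) As Boolean
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False
        .Send
        If .Status = 200 Then
            body = .ResponseText
            FetchPage = True
        End If
    End With
End Function
```

A caller could invoke Pause between URLs and skip or retry any URL for which FetchPage returns False. The length of the pause is a guess to tune by experiment. If the error page is returned with status 200, the caller would also need to look for the error text in body before parsing.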
