I've written some VBA code to get all the movie names from a specific page of a torrent site. Stepping through it with F8, I can see that the code works and prints the results until it reaches the last result on that page. As soon as it reaches the last name to parse, the program crashes. I tried several times with the same outcome. If VBA doesn't support this CSS selector method, how can I collect the results up to the last one? Is there a reference I need to add to the library, or something else I should do before execution? Any help would be much appreciated.

Here is the code I have written:

Sub Torrent_data()

    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim movie_name As Object, movie As Object

    With http
        .Open "GET", "https://www.yify-torrent.org/search/1080p/", False
        .send
        html.body.innerHTML = .responseText
    End With

    Set movie_name = html.querySelectorAll("div.mv h3 a")

    For Each movie In movie_name
        x = x + 1: Cells(x, 1) = movie.innerText
    Next movie

End Sub
  • Any error message? Commented Aug 20, 2017 at 16:10

4 Answers

Try this:

Sub Torrent_data()

    Dim http As New XMLHTTP60, html As New HTMLDocument, x As Long
    
    With http
        .Open "GET", "https://www.yify-torrent.org/search/1080p/", False
        .send
        html.body.innerHTML = .responseText
    End With

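    ' Index into the node list directly instead of using For Each.
    ' Error 91 (object variable not set) is raised once the index
    ' runs past the last match, which ends the loop.
    On Error Resume Next
    Do
        x = x + 1
        Cells(x, 1) = html.querySelectorAll("div.mv h3 a")(x - 1).innerText
    Loop Until Err.Number = 91
    On Error GoTo 0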
    
End Sub

Here is another way that doesn't need an error handler:

Sub GetContent()
    Const URL$ = "https://yify-torrent.cc/search/1080p/"
    Dim HTMLDoc As New HTMLDocument, R&, I&

    With New ServerXMLHTTP60
        .Open "Get", URL, False
        .send
        HTMLDoc.body.innerHTML = .responseText
    End With

    With HTMLDoc.querySelectorAll("h3 > a.movielink")
        For I = 0 To .Length - 1
            R = R + 1: Cells(R, 1).Value = .Item(I).innerText
        Next I
    End With
End Sub
2 Comments

I really don't know. It might be an internal bug. Try putting Exit Sub or Exit For inside the For Each loop and deleting everything else in it; you will see it still crashes. This made me think the problem is related to the For Each loop, so I got rid of it, and here we are.
@Tehscript, found an issue with querySelectorAll. Posted details in the second answer.

The code retrieves one extra element after the last movie.

This extra element causes the failure, so For Each ... cannot be used.

Not sure why yet; will update.

Sub Torrent_data()

    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim movie_name As Object, movie As Object

    With http
        .Open "GET", "https://www.yify-torrent.org/search/1080p/", False
        .send
        html.body.innerHTML = .responseText
    End With

    Set movie_name = html.querySelectorAll("div.mv h3 a")

    Dim i As Long
    For i = 0 To movie_name.Length - 1
        ' The node list is zero-based but worksheet rows start at 1
        Cells(i + 1, 1) = movie_name(i).innerText
    Next i

End Sub


It looks like querySelectorAll has an issue of some sort.

The object html.querySelectorAll(".mv h3 a") cannot be examined in the Watch window.

Attempting to do so crashes Excel or Word (I tried both).

I tried other tags, with the same result.

Sub Torrent_data()

    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim movie_name As Object, movie As Object

    With http
        .Open "GET", "https://www.yify-torrent.org/search/1080p/", False
        .send
        html.body.innerHTML = .responseText
    End With

'   Set movie_name = html.querySelectorAll("div.mv h3 a")   ' querySelectorAll crashes VBA when trying to examine movie_name object

    Set movie_name = html.getElementsByClassName("mv")      ' HTMLElementCollection

    For Each movie In movie_name
        x = x + 1: Cells(x, 1) = movie.getElementsByTagName("a")(1).innerText
    Next movie

'   HTML block for each movie looks like this

'   <div class="mv">
'       <h3>
'           <a href='/movie/55346/download-smoke-1995-1080p-mp4-yify-torrent.html' target="_blank" title="Smoke (1995) 1080p">Smoke (1995) 1080p</a>
'       </h3>
'       <div class="movie">
'           <div class="movie-image">
'               <a href="/movie/55346/download-smoke-1995-1080p-mp4-yify-torrent.html" target="_blank" title="Download Smoke (1995) 1080p">
'                   <span class="play"><span class="name">Smoke (1995) 1080p</span></span>
'                   <img src="//pic.yify-torrent.org/20170820/55346/smoke-1995-1080p-poster.jpg" alt="Smoke (1995) 1080p" />
'               </a>
'           </div>
'       </div>
'       <div class="mdif">
'           <ul>
'               <li><b>Genre:</b>Comedy</li><li><b>Quality:</b>1080p</li><li><b>Screen:</b>1920x1040</li><li><b>Size:</b>2.14G</li><li><b>Rating:</b>7.4/10</li><li><b>Peers:</b>2</li><li><b>Seeds:</b>0</li>
'           </ul>
'           <a href="/movie/55346/download-smoke-1995-1080p-mp4-yify-torrent.html" class="small button orange" target="_blank" title="Download Smoke (1995) 1080p YIFY Torrent">Download</a>
'       </div>
'   </div>

End Sub


I know this is old, but I worked out how to use querySelectorAll without crashing IE.

Instead of using For Each, I used a plain For loop.

Example below:

Dim priceData As Object, i As Long
Set priceData = IE.document.getElementsByClassName("list-flights")(0).querySelectorAll("[class$='price']")

' Walk the node list by index rather than with For Each
For i = 0 To priceData.Length - 1
    Debug.Print priceData.item(i).getElementsByClassName("cash js_linkInsideCell")(0).innerHTML
Next i
