I am creating a macro to scrape data from a website. The problem I am having is that when the last entry on a full page has no data in column A, but the other columns do, I receive a run-time error 1004. For example, if there are 6 pages to scrape and column A is empty in the last entry on page 5, the macro scrapes all of the data on page 5 but then throws the run-time error when it tries to get page 6. There is data on page 6 as well, so I suspect the empty cell in column A is what triggers the error. Any ideas on this? Also, with the code that I am including, would it be easier to have the macro loop until the "next" arrow is gone? If so, how would I go about doing that?

'Macro to query Daily Activity Search for DFB Counties
'Run Monday to pull data from Friday

Sub queryActivityDailyMforFWorking()

Dim nextrow As Integer, i As Integer
Dim dates
dates = Date - 3

Application.ScreenUpdating = False
Application.DisplayStatusBar = True

Do While i <= 50
    Application.StatusBar = "Processing Page " & i
    nextrow = ActiveSheet.Cells(Rows.Count, "A").End(xlUp).row + 1
    With ActiveSheet.QueryTables.Add(Connection:= _
        "URL;https://www.myfloridalicense.com/delinquency_results.asp?SID=&page=" & i & "&county_1=16&county_1=21&county_1=23&county_1=32&county_1=36&county_1=41&county_1=46&county_1=53&county_1=54&county_1=57&county_1=60&county_1=66&status=R&send_date=" & dates & "&search_1.x=1", _
        Destination:=Range("A" & nextrow))

        '.Name = _
        "2015&search_1.x=40&search_1.y=11&date=on&county_1=AL&lic_num_del=&lic_num_rep=&status=NS&biz_name=&owner_name="
        .FieldNames = False
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlSpecifiedTables
        .WebFormatting = xlWebFormattingNone
        .WebTables = "10"
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = False
        .WebDisableRedirections = False
        .Refresh BackgroundQuery:=False

    'autofit columns
    Columns("A:G").Select
    Selection.EntireColumn.AutoFit

   'check for filter, if not then turn on filter
   ActiveSheet.AutoFilterMode = False
    If Not ActiveSheet.AutoFilterMode Then
    ActiveSheet.Range("A:G").AutoFilter
    End If

i = i + 1
End With
Application.StatusBar = False

'Align text left
Cells.Select
With Selection
    .HorizontalAlignment = xlLeft
    .VerticalAlignment = xlBottom
    .WrapText = False
    .Orientation = 0
    .AddIndent = False
    .IndentLevel = 0
    .ShrinkToFit = False
    .ReadingOrder = xlContext
    .MergeCells = False
End With

Loop

End Sub
  • When I select debug, it shows that the error is coming on .Refresh BackgroundQuery :=False Commented Jan 11, 2016 at 15:45
  • is it setting as backgroundquery true, but refreshing as false? Commented Jan 11, 2016 at 15:55
  • Yes, I do believe so... Commented Jan 11, 2016 at 16:18
  • Is there a page zero? You declare i but never assign it a value, so in the first loop i is 0. Commented Jan 11, 2016 at 17:05
  • @Jeeped No, there is no page zero. I forgot to include that portion in the code. Even when I initialize i to 1, I still receive the same error message. Commented Jan 11, 2016 at 17:19

1 Answer

I wasn't able to replicate your error, but I would guess that it has to do with your nextrow variable. If the data on a page ends with an empty cell in column A, End(xlUp) stops above that row, so nextrow for the next page of data lands inside the previous page's data. I would expect that to cause problems when you add another query table and then refresh it, because the two tables would overlap. You can get around this by taking the bottom row from another column, if you know of one that always has data in every row. I made some updates and it seems to work pretty well for me:

  • Added error handling
  • Check columns A and B for the bottom row of data
  • Added some logic to check if a full page was returned and if not to exit the loop so you don't have to keep parsing empty pages
  • Formatted the date in the connection string as I've found that to cause issues in the past
  • Added the option to get rid of the headers if you don't want them
  • Moved the cell formatting out of the loop so it is only executed once

Hope this helps.

Sub queryActivityDailyMforFWorking()
On Error GoTo Err_queryActivityDailyMforFWorking

Const RowsPerPage As Byte = 20
Const DeleteHeader As Boolean = True

Dim nextrow As Long, maxrow As Long, i As Long ' Long avoids overflow past row 32,767
Dim dates As Date

dates = Date - 3

Application.ScreenUpdating = False
Application.DisplayStatusBar = True

nextrow = 1
For i = 1 To 50
    Application.StatusBar = "Processing Page " & i
    With ActiveSheet.QueryTables.Add(Connection:= _
        "URL;https://www.myfloridalicense.com/delinquency_results.asp?SID=&page=" & i & "&county_1=16&county_1=21&county_1=23&county_1=32&county_1=36&county_1=41&county_1=46&county_1=53&county_1=54&county_1=57&county_1=60&county_1=66&status=R&send_date=" & Format(dates, "m/d/yyyy") & "&search_1.x=1", _
        Destination:=Range("A" & nextrow))
        '.Name = _
        "2015&search_1.x=40&search_1.y=11&date=on&county_1=AL&lic_num_del=&lic_num_rep=&status=NS&biz_name=&owner_name="
        .FieldNames = False
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlSpecifiedTables
        .WebFormatting = xlWebFormattingNone
        .WebTables = "10"
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = False
        .WebDisableRedirections = False
        .Refresh BackgroundQuery:=False
    End With

    ' Delete the header as required
    If DeleteHeader And i > 1 And ActiveSheet.Cells(nextrow, 1).Value = "License" Then ActiveSheet.Cells(nextrow, 1).EntireRow.Delete

    ' Find the bottom row
    maxrow = Application.WorksheetFunction.Max(ActiveSheet.Cells(Rows.Count, 1).End(xlUp).Row, ActiveSheet.Cells(Rows.Count, 2).End(xlUp).Row)
    ' Stop scraping if a full page wasn't returned
    If (maxrow - nextrow) < (RowsPerPage - IIf(DeleteHeader, 1, 0)) Then
        Exit For
    ' Otherwise set the row for the next page of data
    Else
        nextrow = maxrow + 1
    End If
Next i

Application.StatusBar = "Formatting data"

'autofit columns
ActiveSheet.Columns.EntireColumn.AutoFit

'check for filter, if not then turn on filter
ActiveSheet.AutoFilterMode = False
If Not ActiveSheet.AutoFilterMode Then ActiveSheet.Range("A:G").AutoFilter

'Align text left
With ActiveSheet.Cells
    .HorizontalAlignment = xlLeft
    .VerticalAlignment = xlBottom
    .WrapText = False
    .Orientation = 0
    .AddIndent = False
    .IndentLevel = 0
    .ShrinkToFit = False
    .ReadingOrder = xlContext
    .MergeCells = False
End With

Exit_queryActivityDailyMforFWorking:
    Application.StatusBar = False
    Application.ScreenUpdating = True
    Exit Sub

Err_queryActivityDailyMforFWorking:
    MsgBox Err.Description, vbCritical + vbOKOnly, Err.Number & " - Web Scraping Error"
    Resume Exit_queryActivityDailyMforFWorking

End Sub
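
If you later need to check more than columns A and B, the bottom-row lookup could be pulled into a small helper. This is only a sketch: LastRowAcross is a hypothetical name, not something from the code above or from Excel itself, and it assumes at least one of the columns you pass in has data on the last row of every page.

```vba
' Hypothetical helper (not part of the original answer): returns the last
' used row across the given columns, so a blank cell in any single column
' cannot cause the next page to be written over existing data.
Function LastRowAcross(ws As Worksheet, ParamArray cols() As Variant) As Long
    Dim c As Variant
    Dim r As Long

    For Each c In cols
        ' c may be a column letter ("A") or a column number (1)
        r = ws.Cells(ws.Rows.Count, c).End(xlUp).Row
        If r > LastRowAcross Then LastRowAcross = r
    Next c
End Function
```

Inside the loop, the maxrow line would then read something like maxrow = LastRowAcross(ActiveSheet, "A", "B"), and you can add further columns to the call without nesting more Max calls.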

1 Comment

This works the way I intended the code to work. Thank you very much for your help! :) Finding the bottom row was something I had been trying to figure out how to code for the longest time. I figured it would be the easiest way to tell the code to stop running once it found the last row of data, but I could not work out how to write it. Your answer has definitely helped me out a whole bunch. Thank you again.
