I am scraping tables from a website. So far I have scraped each web page one at a time, but since the URLs follow a pattern, I am thinking of running them through a for loop.
I am trying to use the following script:
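library(rvest)
tables <- vector("list", 38)  # one slot per page, so results are not overwritten
for (i in 1:38) {
  # read_html() needs the scheme; a bare "www..." string is read as a file path
  webpage <- read_html(paste0("http://www.website.com/", i))
  tables[[i]] <- webpage %>%
    html_nodes("table") %>%
    .[[1]] %>%
    html_table()
}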
My main issue is that the URLs I am scraping do not end with the number, so I can't drop them straight into the for loop above. Instead they read as follows (it would be a lot easier without the trailing /W): www.website.com/sample/test-01/W, www.website.com/sample/test-02/W, www.website.com/sample/test-03/W etc.
I feel as though there is an extremely simple way to place these into the above for loop but I am not sure of the syntax.
EDIT: One more issue is the 0 in the URL www.website.com/sample/test-01/W. I can't just paste i after a fixed 0, because the numbering goes 06, 07, 08, 09, 10, 11: the leading 0 is only valid up to 09, and a page such as www.website.com/sample/test-012/W does not exist.
Is there a way to add the /W part to the webpage variable?
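For illustration, here is a minimal sketch of one way both the trailing /W and the leading zero might be handled: build all the URLs up front with sprintf(), whose "%02d" format pads numbers to two digits. The http:// scheme and the helper name scrape_first_table are assumptions for this example and are not part of the real site.

```r
# Zero-pad the page numbers to two digits: 1 becomes "01", 10 stays "10"
ids <- sprintf("%02d", 1:38)

# paste0() is vectorised, so the fixed prefix and the "/W" suffix are
# attached to every id in one call (the "http://" scheme is assumed)
urls <- paste0("http://www.website.com/sample/test-", ids, "/W")

urls[1]   # "http://www.website.com/sample/test-01/W"
urls[12]  # "http://www.website.com/sample/test-12/W"

# Hypothetical helper: return the first table on a page as a data frame.
# rvest is only needed once this function is actually called.
scrape_first_table <- function(url) {
  page   <- rvest::read_html(url)
  tables <- rvest::html_nodes(page, "table")
  rvest::html_table(tables[[1]])
}

# Usage (left commented out because www.website.com is a placeholder):
# all_tables <- lapply(urls, scrape_first_table)
```

Building the URL vector first means the loop itself never has to deal with the padding or the suffix, and lapply() returns one table per page in a list instead of overwriting a single variable.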