I have a dataset on a GitHub page. I imported it into RStudio as a CSV file and created a vector of URLs called "StoryLink". Now I want to scrape data from each of these web pages, so I wrote a for loop that parses each page into "ArticlePage" with read_html() and extracts the matching text into a character vector called "articleText", which I assign to "PoliticalArticles".
My problem is that even though I created a for loop, it only scrapes the last web page (the 6th article) in the list of URLs. How do I scrape all of the URLs?
library(rvest)
library(dplyr)

GitHubpoliticsconversions <- "https://raw.githubusercontent.com/lukanius007/web_scraping_politics/main/politics_conversions.csv"
CSVFile <- read.csv(GitHubpoliticsconversions, header = TRUE, sep = ",")
StoryLink <- pull(CSVFile, 4)   # fourth column holds the article URLs

page <- c()
for (i in 1:6) {
  page[i] <- StoryLink[i]
  ArticlePage <- read_html(page[i])
  articleText <- ArticlePage %>% html_elements(".lead , .article__title") %>% html_text()
  PoliticalArticles <- c(articleText)   # overwritten on every pass, so only the last page survives
}
This is the result I got from this code, but I need the same output for all of the web pages:
> PoliticalArticles
[1] "Wie es zur Hausdurchsuchung bei Finanzminister Blümel kam"
[2] "Die Novomatic hatte den heutigen Finanzminister 2017 um Hilfe bei Problemen im Ausland gebeten – und eine Spende für die ÖVP angeboten. Eine solche habe er nicht angenommen, sagt Blümel."
>
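A minimal sketch of one way to fix this, assuming the same CSV and CSS selectors as above: the loop overwrites PoliticalArticles on every iteration, so instead grow a list inside the loop and flatten it afterwards.

library(rvest)
library(dplyr)

GitHubpoliticsconversions <- "https://raw.githubusercontent.com/lukanius007/web_scraping_politics/main/politics_conversions.csv"
CSVFile <- read.csv(GitHubpoliticsconversions, header = TRUE, sep = ",")
StoryLink <- pull(CSVFile, 4)

PoliticalArticles <- list()                 # one list element per article
for (i in seq_along(StoryLink)) {
  ArticlePage <- read_html(StoryLink[i])
  PoliticalArticles[[i]] <- ArticlePage %>%
    html_elements(".lead , .article__title") %>%
    html_text()
}
PoliticalArticles <- unlist(PoliticalArticles)   # single character vector covering all pages

The same idea can be written without an explicit accumulator, e.g. lapply(StoryLink, function(url) read_html(url) %>% html_elements(".lead , .article__title") %>% html_text()) followed by unlist(), which sidesteps the overwriting problem entirely.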