I am scraping a website where every page returns 10 observations. I am writing a function that takes the number of pages to scrape, stores each page's result in a list, and then converts the list into a data frame.
library(jsonlite)
forum_data_fetch <- function(no_of_pages) {
  pages <- seq(no_of_pages)
  #print(pages)
  forum_data <- list()
  for(i in 1:length(pages)){
    tmp <- fromJSON(paste("http://mmb.moneycontrol.com/index.php?q=topic/ajax_call&section=get_messages&offset=&lmid=&isp=0&gmt=cat_lm&catid=1&pgno=",i,sep=""))
    forum_data[[i]] <- tmp
  }
  dat <- as.data.frame(forum_data)
  dat <- dat[,c("msg_id","border_msg_count","user_id","border_level_text","follower_count", "topic", "tp_sector","tp_msg_count","heading", "flag", "price", "message")]
  return(dat)
}
test <- forum_data_fetch(3)
I expect the function above to return 30 observations (3 pages of 10 each), but it returns only 10. I think I am doing something wrong when converting the list to a data.frame.
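The problem is the line dat <- as.data.frame(forum_data): called on a list of data frames, it places them side by side as extra columns, so the result still has only 10 rows. Replace it with dat <- lapply(forum_data, as.data.frame) %>% rbindlist() (this uses the data.table and dplyr packages), which stacks the pages row-wise.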
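To see the difference without hitting the website, here is a minimal sketch with made-up data. The helper make_page and its two columns are hypothetical stand-ins for what fromJSON returns per page, and it uses base R's do.call(rbind, ...) rather than rbindlist so that it runs without extra packages; the idea of stacking pages row-wise is the same.

```r
# Hypothetical stand-in for one parsed page: 10 rows, two example columns.
make_page <- function(page_no) {
  data.frame(
    msg_id  = (page_no - 1) * 10 + 1:10,
    message = paste("message", (page_no - 1) * 10 + 1:10),
    stringsAsFactors = FALSE
  )
}

# Same shape as forum_data in the question: a list with one table per page.
forum_data <- lapply(1:3, make_page)

# as.data.frame() puts the pages side by side: still 10 rows, more columns.
wide <- as.data.frame(forum_data)
dim(wide)

# Row-binding stacks the pages instead: 10 rows per page.
dat <- do.call(rbind, lapply(forum_data, as.data.frame))
dim(dat)
```

If the pages could differ in their columns, rbindlist(forum_data, fill = TRUE) from data.table is probably the more forgiving choice, since base rbind requires matching column names.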