I'm currently merging 12 data frames of 48,000 obs each by an id column, adding one value column per frame, so the result is a 48k obs x 14 variable data frame. However, this is taking too long to process, and I'm looking for a faster way to do it.
Example (dput output):
# January data
jan <- structure(list(gridNumber = c("17578", "18982", "18983", "18984",
"18985"), PRISM_ppt_stable_4kmM2_193301_bil = c(35.7099990844727,
36, 35.4199981689453, 33.7299995422363, 33.2799987792969)), .Names = c("gridNumber",
"PRISM_ppt_stable_4kmM2_193301_bil"), row.names = c("17578",
"18982", "18983", "18984", "18985"), class = "data.frame")
# February data
feb <- structure(list(gridNumber = c("17578", "18982", "18983", "18984",
"18985"), PRISM_ppt_stable_4kmM2_193302_bil = c(14.6199998855591,
14.5600004196167, 14.9899997711182, 15.4700002670288, 15.5799999237061
)), .Names = c("gridNumber", "PRISM_ppt_stable_4kmM2_193302_bil"
), row.names = c("17578", "18982", "18983", "18984", "18985"), class = "data.frame")
# March Data
mar <- structure(list(gridNumber = c("17578", "18982", "18983", "18984",
"18985"), PRISM_ppt_stable_4kmM2_193303_bil = c(23.8400001525879,
23.9200000762939, 24.3400001525879, 25.7900009155273, 26.5900001525879
)), .Names = c("gridNumber", "PRISM_ppt_stable_4kmM2_193303_bil"
), row.names = c("17578", "18982", "18983", "18984", "18985"), class = "data.frame")
dplyr Code:
library(dplyr)
datalist <- list(jan, feb, mar)
full <- Reduce(function(x, y) full_join(x, y, by = "gridNumber"), datalist)
With only five rows per frame this example obviously runs quickly, but on the full-size data the chained joins are slow. Is there a faster way to do this?
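For reference, one alternative I've been considering is a keyed data.table merge. This is only a sketch, assuming every frame really does share the same gridNumber key column:

```r
library(data.table)

# Convert each frame to a data.table and key it on gridNumber;
# setkey() sorts by the key, which lets merge() use a fast binary join
dtlist <- lapply(list(jan, feb, mar),
                 function(d) setkey(as.data.table(d), gridNumber))

# merge() on keyed data.tables joins on the shared key by default;
# all = TRUE makes it a full join, keeping rows unmatched on either side
full_dt <- Reduce(function(x, y) merge(x, y, all = TRUE), dtlist)
```

The result has one gridNumber column plus one precipitation column per month, the same shape as the full_join() output.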
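Since each frame appears to contain exactly the same set of gridNumber values, I've also wondered whether the join can be skipped entirely. A sketch, valid only under that identical-keys assumption:

```r
# Sort every frame by gridNumber so the rows line up,
# then bind the value columns side by side instead of joining
sorted <- lapply(list(jan, feb, mar), function(d) d[order(d$gridNumber), ])

# Keep the first frame whole; from the rest, drop the duplicate
# gridNumber column (column 1) before cbind-ing
full_cb <- do.call(cbind,
                   c(sorted[1],
                     lapply(sorted[-1], function(d) d[, -1, drop = FALSE])))
```

This avoids any key matching, but it silently produces wrong results if even one frame is missing a gridNumber, so it would need a check (e.g. that all frames have identical sorted keys) before being trusted at scale.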