I need to merge two data.frames of unequal size that share a unique identifier (ID), and I want the result to keep the number of rows of the smaller data.frame.

More importantly, I want the values of variable x in data.frame.1 (the larger one) summed for each unique ID, so that in data.frame.3 (the merged dataset) each value of x is the sum of the observations that shared that ID in data.frame.1.

Essentially, I want the merged dataset to have the row dimensions of the smaller dataset (data.frame.2), i.e. the same number of observations, with column x from the larger data.frame (data.frame.1) added alongside the columns of data.frame.2 and its values aggregated by sum as described above.

I hope the tables below make this clearer: there are three unique IDs (a, b, c), but in data.frame.1 they are repeated. I want the repeated values summed when the merge takes place.

data.frame.1
ID  x
a   1
a   8
a   10
b   2
b   1
c   4

data.frame.2
ID  y
a   3
b   7
c   9

data.frame.3
ID  y  x
a   3  19
b   7  3
c   9  4

1 Answer

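# Recreate the example data from the question
data.frame1 <- data.frame(ID = c(rep("a",3), rep("b",2), "c"),
                         x = c(1,8,10,2,1,4))
data.frame2 <- data.frame(ID = c("a", "b", "c"),
                         y = c(3, 7, 9))

# Sum x within each ID (this overwrites data.frame1 with one row per ID)
data.frame1 <- aggregate(x ~ ID, data.frame1, sum)
# Join the summed x onto data.frame2 by ID
data.frame3 <- merge(data.frame2, data.frame1, by = "ID")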
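
If you would rather keep the original data.frame.1 intact, or if some IDs in the smaller data.frame might have no rows in the larger one, a variation along these lines may help. This is only a sketch: the names df1, df2, df3 and x_sums are placeholders I made up, and all.x = TRUE is an optional addition that is not needed for the example data.

```r
# Example data from the question (df1, df2 are placeholder names).
df1 <- data.frame(ID = c("a", "a", "a", "b", "b", "c"),
                  x  = c(1, 8, 10, 2, 1, 4))
df2 <- data.frame(ID = c("a", "b", "c"),
                  y  = c(3, 7, 9))

# Sum x within each ID, stored under a new name so df1 is left untouched.
x_sums <- aggregate(x ~ ID, data = df1, FUN = sum)

# Left join: all.x = TRUE keeps every row of df2 and fills x with NA
# for any ID that has no match in df1 (there are none in this example).
df3 <- merge(df2, x_sums, by = "ID", all.x = TRUE)
df3
```

With the example data, df3 has the same three rows as df2, with x equal to 19, 3 and 4 for a, b and c.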