R - Populate one data frame with values from another dataframe, based on row matching

Question

I'm trying to replace values in myDF1 with values from myDF2 where rows match on the column "studyno", but the solutions I have found so far don't give me the desired output.

Below are the data.frames:

myDF1 <- structure(list(studyno = c("J1000/9", "J1000/9", "J1000/9", "J1000/9", 
"J1000/9", "J1000/9"), date = structure(c(17123, 17127, 17135, 
17144, 17148, 17155), class = "Date"), pf_mcl = c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
), year = c(2016, 2016, 2016, 2016, 2016, 2016)), .Names = c("studyno", 
"date", "pf_mcl", "year"), row.names = c(NA, 6L), class = "data.frame")

myDF2 <- structure(list(studyno = c("J740/4", "J1000/9", "J895/7", "J931/6", 
"J609/1", "J941/3"), pf_mcl = c(0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("studyno", 
"pf_mcl"), row.names = c(NA, 6L), class = "data.frame")

One solution I tried, shown below, seemed to work. However, I find that whatever values were in myDF1 before have been removed.

myDF1$pf_mcl <- myDF2$pf_mcl[match(myDF1$studyno, myDF2$studyno)]
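A small sketch of why that line discards existing values: match() returns NA for every studyno that is absent from the second data frame, and indexing with NA yields NA, so the whole column is overwritten. The data frames df1 and df2 below are hypothetical toy data, not the frames above.

```r
# Hypothetical toy data: df1 already holds values for "A" and "C"
df1 <- data.frame(studyno = c("A", "B", "C"), pf_mcl = c(5L, NA, 7L),
                  stringsAsFactors = FALSE)
df2 <- data.frame(studyno = "B", pf_mcl = 0L, stringsAsFactors = FALSE)

# Position of each df1 key in df2; NA where df2 has no such key
idx <- match(df1$studyno, df2$studyno)

# Indexing with NA gives NA, so assigning this vector to df1$pf_mcl
# would replace the existing 5 and 7 with NA
replacement <- df2$pf_mcl[idx]
print(idx)
print(replacement)
```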

Can you clarify the output you want, & how your proposed solution differs? It seems to me that if you want to "replace values in myDF1 from myDF2", then the "values [that] were in myDF1 before" should "have been removed", so I think I'm missing something.
– gung - Reinstate Monica, Commented Oct 13, 2017 at 16:43
Hi @gung, sorry for not being clear. myDF2 is a subset of myDF1; however, myDF2 is better curated than myDF1. I have found that some rows in myDF1 have missing values, so I am looking for a match in myDF2 and updating those values in myDF1. However, I don't want to lose the values in rows that don't match, which is what the script I posted was doing. Let me know if I need to add more detail.
– K. Wamae, Commented Oct 13, 2017 at 16:55
Hi @Kelli-Jean, could you give an example, please? I have seen some solutions with the merge function and still wasn't getting the right output.
– K. Wamae, Commented Oct 13, 2017 at 16:57

Kelli-Jean · Accepted Answer · 2017-10-13 22:05:25Z

# Merge myDF1 and myDF2 by "studyno", keeping all the rows in myDF1.
# The shared pf_mcl column comes back as pf_mcl.x (from myDF1) and pf_mcl.y (from myDF2).
agg_df = merge(myDF1, myDF2, "studyno", all.x=TRUE)
# Populate pf_mcl in the merged data frame: use pf_mcl from myDF2 if it is available;
# otherwise keep pf_mcl from myDF1.
agg_df$pf_mcl = ifelse(is.na(agg_df$pf_mcl.y), agg_df$pf_mcl.x, agg_df$pf_mcl.y)
# Keep only the original myDF1 columns, in their original order
myDF1 = agg_df[, names(myDF1)]
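
For comparison, here is a minimal sketch of the same update rule written with match() instead of merge(), assuming each studyno appears at most once in the curated data frame. The helper name update_from and the toy data are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: copy `col` from `source` into `target` wherever the
# `key` values match and the source value is not NA; leave other rows alone.
update_from <- function(target, source, key, col) {
  idx <- match(target[[key]], source[[key]])       # NA where there is no match
  hit <- !is.na(idx) & !is.na(source[[col]][idx])  # rows that can be updated
  target[[col]][hit] <- source[[col]][idx[hit]]
  target
}

# Hypothetical toy data: "A" has no match, "B" is missing, "C" is wrong
df1 <- data.frame(studyno = c("A", "B", "C"), pf_mcl = c(5L, NA, 7L),
                  stringsAsFactors = FALSE)
df2 <- data.frame(studyno = c("B", "C"), pf_mcl = c(0L, 9L),
                  stringsAsFactors = FALSE)

out <- update_from(df1, df2, "studyno", "pf_mcl")
print(out)
```

Unlike merge(), which sorts the result by the key by default, this keeps the rows of the target in their original order. If a key is repeated in the source, match() uses only its first occurrence.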

edited Oct 13, 2017 at 22:05

answered Oct 13, 2017 at 17:21

3 Comments

K. Wamae Over a year ago

Hi @Kelli-Jean, thanks for the solution, and pardon my explanation; let me elaborate further. As I mentioned earlier, myDF2 is a well curated subset of myDF1. Some rows in the two datasets therefore match based on "studyno", and you may find that the values in myDF1$pf_mcl are missing or wrong. All I want to do is identify a matching row in myDF2 and populate myDF1$pf_mcl with the value in myDF2$pf_mcl. If a row does not match, the value should remain the same. I don't know whether it's worth mentioning, but the two data frames have other columns; I have selected a few for example purposes.

Kelli-Jean Over a year ago

@K.Wamae I updated my answer. If this is still not the answer you are expecting, can you provide a data set that has records where your solution is not working? And the expected output. Thanks!

K. Wamae Over a year ago

Dear @Kelli-Jean, I have tested it and it works perfectly. Thank you big time for the solution...
