
Hi, I want to add rows from one dataframe to another dataframe using R.

I have one dataframe, Data1, which is missing some IDs, and another, DATA2, which has all the IDs. I want to replace the DATA2 frequency column with the Data1 values for all matching IDs, and get the dataframe shown under Output as my result.

Data1
ID  frequency
1   1
2   7
3   11
5   4

DATA2
ID  frequency
1   0
2   0
3   0
4   0
5   0
6   0

Output
ID  frequency
1   1
2   7
3   11
4   0
5   4
6   0
  • I would use DATA2[match(Data1$ID, DATA2$ID), 'frequency'] <- Data1$frequency Commented Mar 17, 2017 at 2:17
  • Along the right lines, but that will replace the values in DATA2, not create a new Output. Commented Mar 17, 2017 at 2:31
  • Does DATA2 always contain only zeroes, or can it contain other values that you may not want to replace? Commented Mar 17, 2017 at 2:45
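
The match() suggestion in the first comment can be adapted to leave DATA2 untouched by assigning into a copy, which addresses the point raised in the second comment. A minimal sketch, assuming the Data1 and DATA2 frames from the question; the idx and found names, and the guard for IDs that are missing from DATA2, are illustrative additions rather than part of the original comment:

```r
Data1 <- data.frame(ID = c(1L, 2L, 3L, 5L), frequency = c(1L, 7L, 11L, 4L))
DATA2 <- data.frame(ID = 1:6, frequency = rep(0L, 6))

# Work on a copy so DATA2 itself is not modified
Output <- DATA2

# Position in Output of each Data1 ID (NA when the ID is absent from Output)
idx <- match(Data1$ID, Output$ID)
found <- !is.na(idx)

# Overwrite only the rows whose ID was found
Output$frequency[idx[found]] <- Data1$frequency[found]
Output
```

Without the found filter, an ID present in Data1 but not in DATA2 would put an NA in the subscript, which R rejects in this kind of assignment.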

3 Answers


If the ID values are unique, I think you can use ID as the row names.

data1 <- data.frame(
  freq = c(1, 7, 11, 4),
  row.names = c(1, 2, 3, 5)
)

data2 <- data.frame(
  freq = rep(0,6),
  row.names = seq(1, 6)
)
output <- data2
# Assign by row name, so each data1 value lands on the output row with the same ID
output[rownames(data1), "freq"] <- data1$freq

And the result is:

> output
  freq
1    1
2    7
3   11
4    0
5    4
6    0

You could do a join with data.table.

library(data.table)
## set both data frames to data tables
setDT(Data1); setDT(Data2)
## copy 'Data2' to a new table 'Output' which we will assign values to
Output <- copy(Data2)
## join on 'ID' and assign by reference the relevant 'frequency' values
Output[Data1, frequency := i.frequency, on = "ID"]
Output
#    ID frequency
# 1:  1         1
# 2:  2         7
# 3:  3        11
# 4:  4         0
# 5:  5         4
# 6:  6         0

Original data:

Data1 <- structure(list(ID = c(1L, 2L, 3L, 5L), frequency = c(1L, 7L, 
11L, 4L)), .Names = c("ID", "frequency"), class = "data.frame", row.names = c(NA, 
-4L))

Data2 <- structure(list(ID = 1:6, frequency = c(0L, 0L, 0L, 0L, 0L, 0L
)), .Names = c("ID", "frequency"), class = "data.frame", row.names = c(NA, 
-6L))


I'm sure there's an elegant single-line solution, but the dplyr way is to join the data frames by ID and then tidy up the output.

library(dplyr)
OUTPUT <- Data1 %>% 
            right_join(DATA2, by = "ID") %>%
            mutate(frequency = ifelse(is.na(frequency.x), frequency.y, frequency.x)) %>%
            select(ID, frequency)

1 Comment

No, the ifelse is required. It replaces NA values in frequency.x with the corresponding value from frequency.y. Just run the right join without the later steps and you'll see why.
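
To see what this comment describes, the intermediate join can be inspected before it is tidied. The sketch below uses base R's merge() with all.y = TRUE as a stand-in for right_join(), so it is not the answer's exact code; the joined name is illustrative, and the data are the frames from the question:

```r
Data1 <- data.frame(ID = c(1L, 2L, 3L, 5L), frequency = c(1L, 7L, 11L, 4L))
DATA2 <- data.frame(ID = 1:6, frequency = rep(0L, 6))

# Keep every row of DATA2; IDs absent from Data1 get NA in frequency.x
joined <- merge(Data1, DATA2, by = "ID", all.y = TRUE)
joined

# Fall back to the DATA2 value wherever the Data1 value is missing
OUTPUT <- data.frame(
  ID = joined$ID,
  frequency = ifelse(is.na(joined$frequency.x),
                     joined$frequency.y,
                     joined$frequency.x)
)
OUTPUT
```

Printing joined shows NA in frequency.x for IDs 4 and 6, which is the gap the ifelse step fills.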
