2

I have an example dataframe that has a column which stores 3 letters of the alphabet per row. The dataframe also has 2 additional columns: date and colour:

Alphabet       Date   Colour
  ABC    2018-09-10   green
  DEF    2017-06-11   red
  GHI    2016-05-12   blue
  JKL            NA   yellow
  MNO            NA   orange
  PQR       Unknown   brown

Some of the dates are missing / unknown in this dataframe. I have another dataframe which also has a alphabet and a date column. This second dataframe includes the dates for the missing dates in the first dataframe:

Alphabet       Date   
  JKL    2017-06-07  
  MNO    2018-08-03   
  PQR    2019-10-07
  STU    2019-11-08
  VWX    2019-12-08   

I want to fill in the missing dates from the first dataframe by matching the Alphabet records from both dataframes and then insert the dates from the second dataframe into the first dataframe.

Desired output:

Alphabet       Date   Colour
  ABC    2018-09-10   green
  DEF    2017-06-11   red
  GHI    2016-05-12   blue
  JKL    2017-06-07   yellow
  MNO    2018-08-03   orange
  PQR    2019-10-07   brown

Any help is appreciated.

2 Answers 2

1

One option is a join with data.table

library(data.table)
setDT(df1)[df2, Date := i.Date, on = .(Alphabet)]
df1
#   Alphabet       Date Colour
#1:      ABC 2018-09-10  green
#2:      DEF 2017-06-11    red
#3:      GHI 2016-05-12   blue
#4:      JKL 2017-06-07 yellow
#5:      MNO 2018-08-03 orange
#6:      PQR 2019-10-07  brown

Update

Using the new 'df2n' dataset

i1 <- is.na(df1$Date)|df1$Date %in% "Unknown"
setDT(df1)[df2n[df2n$Alphabet %in% df1$Alphabet[i1],],
         Date := i.Date, on = .(Alphabet)]
df1
#   Alphabet       Date Colour
#1:      ABC 2018-09-10  green
#2:      DEF 2017-06-11    red
#3:      GHI 2016-05-12   blue
#4:      JKL 2017-06-07 yellow
#5:      MNO 2018-08-03 orange
#6:      PQR 2019-10-07  brown

Or using match from base R

i1 <- match(df2$Alphabet, df1$Alphabet)
df1$Date[i1] <- df2$Date

data

df1 <- structure(list(Alphabet = c("ABC", "DEF", "GHI", "JKL", "MNO", 
"PQR"), Date = c("2018-09-10", "2017-06-11", "2016-05-12", NA, 
NA, "Unknown"), Colour = c("green", "red", "blue", "yellow", 
"orange", "brown")), class = "data.frame", row.names = c(NA, 
-6L))

df2 <- structure(list(Alphabet = c("JKL", "MNO", "PQR"), Date = c("2017-06-07", 
"2018-08-03", "2019-10-07")), class = "data.frame", row.names = c(NA, 
-3L))

df2a  <- structure(list(Alphabet = c("JKL", "MNO", "PQR", "STU", "VWX"
), Date = c("2017-06-07", "2018-08-03", "2019-10-07", "2019-11-08", 
"2019-12-08")), class = "data.frame", row.names = c(NA, -5L))
Sign up to request clarification or add additional context in comments.

8 Comments

I'm getting this error message when I try your match method: NAs are not allowed in subscripted assignments. I don't believe I have NAs in my dataframes.
@KakSchoen I showed the data as well (which is a copy of your example) and it is working fine for me. Please check if you have factor columns. I assumed it as character class. Convert to character if needed df1[] <- lapply(df1, as.character); df2[] <- lapply(df2, as.character) and then apply
The example is also working for me, but my real dataset isn't. I checked my columns and they are all class characters, but I'm still getting the same error.
@KakSchoen Sorry, that is not reproducible for me
I'll try to clarify a bit further: I just tried your solution again but with a slight adjustment. I added a 4th row to df2, so now not every row from df2 matches with df1 (which is actually the case in my real dataset). This gives the the same error message that I was getting with my real dataset. So it seems that if the second dataframe has more rows that are not matching with the first dataframe, then the error occurs.
|
1

Using dplyr we can left_join df1 and df2 and then use coalesce to fill in missing values.

library(dplyr)

left_join(df1, df2, by = "Alphabet") %>%
   mutate(Date = coalesce(Date.y, Date.x)) %>%
  select(-Date.x, -Date.y)

#  Alphabet Colour       Date
#1      ABC  green 2018-09-10
#2      DEF    red 2017-06-11
#3      GHI   blue 2016-05-12
#4      JKL yellow 2017-06-07
#5      MNO orange 2018-08-03
#6      PQR  brown 2019-10-07

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.