
I have a data frame "z"

   letter color
1       a     0
2       e     0
3       b     0
4       b     0
5       d     0
6       d     0
7       a     0
8       b     0
9       c     0
10      d     0
11      c     0
12      c     0
13      c     0
14      c     0
15      e     0
16      e     0
17      a     0
18      d     0
19      e     0
20      b     0

and another data frame "y"

  letter color
1      a   red
2      b  blue
3      c green

When a letter in z matches a letter in y, I would like to copy the color from y into the corresponding color field in z, but I do not want to remove any values from z. If no match occurs, z$color should remain unchanged. I used 0 as a placeholder in z$color; this could be text instead.

I've been trying for loops, the match() function, and statements with %in%, but I'm not quite getting the results I'm after.

Any ideas?

This is the code I used for the data frames

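set.seed(3)
# 20 letters drawn at random from a-e
z <- data.frame(sample(c("a", "b", "c", "d", "e"), 20, replace = TRUE))
names(z) <- "letter"
z$color <- rep(0, nrow(z))  # placeholder column
z

# lookup table mapping letters to colors
y1 <- c("a", "b", "c")
y2 <- c("red", "blue", "green")
y <- data.frame(cbind(y1, y2))
names(y) <- c("letter", "color")
y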
  • With match, I guess it should be something like y$color[match(z$letter, y$letter)] Commented Feb 11, 2014 at 20:38

3 Answers


You don't need z$color in the first place if it's just a placeholder; you can replace the NAs with 0 afterwards.

z$color<-y[match(z$letter, y$letter),2]
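The one-liner above leaves NA wherever a letter has no match. If the existing values in z$color should be kept for unmatched rows, one possible variant is to assign only at the matched positions. This is a minimal sketch, not the answerer's code: the small z below is a stand-in for the question's data, the placeholder is assumed to be the string "0" so the column stays character, and idx and hit are illustrative names.

```r
# Lookup table with the values from the question.
y <- data.frame(letter = c("a", "b", "c"),
                color  = c("red", "blue", "green"),
                stringsAsFactors = FALSE)

# Small stand-in for z (assumption: placeholder stored as the string "0").
z <- data.frame(letter = c("a", "e", "b", "d", "c"),
                color  = "0",
                stringsAsFactors = FALSE)

idx <- match(z$letter, y$letter)   # row of y for each row of z, NA if no match
hit <- !is.na(idx)                 # TRUE where a match was found
z$color[hit] <- y$color[idx[hit]]  # overwrite matched rows only
z
```

Rows whose letter is not in y keep whatever value they already had, so no separate NA replacement step is needed.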

1 Comment

I included the z$color because I was once told that it isn't efficient for R to populate an empty vector. Basically define a vector for the output up front. I can drop it no problem. This toy example will eventually be expanded to a much larger data frame. Your response was very helpful. Very close to what I've been attempting for the last few hours but never hit on. Thank you!

You can use merge:

dat <- merge(z, y, by = "letter", all.x = TRUE)
transform(dat, color = ifelse(is.na(color.y), 
                              color.x, as.character(color.y)))[-(2:3)]

   letter color
1       a   red
2       a   red
3       a   red
4       b  blue
5       b  blue
6       b  blue
7       b  blue
8       c green
9       c green
10      c green
11      c green
12      c green
13      d     0
14      d     0
15      d     0
16      d     0
17      e     0
18      e     0
19      e     0
20      e     0
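Note that the merged result comes back sorted by letter, because merge() sorts on the by columns by default, so the original row order of z is lost. If that order matters, one option is to carry a row index through the merge and sort on it afterwards. The following is a sketch under assumptions: row_id is a hypothetical helper column, the small z is a stand-in for the question's data, and both color columns are assumed to be character.

```r
y <- data.frame(letter = c("a", "b", "c"),
                color  = c("red", "blue", "green"),
                stringsAsFactors = FALSE)
z <- data.frame(letter = c("b", "e", "a", "d", "c"),
                color  = "0",
                stringsAsFactors = FALSE)

z$row_id <- seq_len(nrow(z))                     # hypothetical helper: original position
dat <- merge(z, y, by = "letter", all.x = TRUE)  # yields color.x (from z) and color.y (from y)
dat <- dat[order(dat$row_id), ]                  # restore z's original row order

# Take y's color where a match exists, otherwise keep z's value.
dat$color <- ifelse(is.na(dat$color.y), dat$color.x, dat$color.y)
res <- dat[, c("letter", "color")]
rownames(res) <- NULL
res
```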

1 Comment

@WillPhillips No, the NAs are replaced by ifelse. The function transform is used to add a new column to the data frame.

sqldf/sqlite is very flexible:

library(sqldf)
z$color="0" # to avoid conflicts numeric/characters
z <- sqldf(c("UPDATE z
             SET color = (SELECT y.color
                          FROM y
                          WHERE z.letter = y.letter
                           )
             WHERE EXISTS (SELECT 1
                           FROM y
                           WHERE z.letter = y.letter
                           )"
             , "select * from main.z"
                  )
           )
z
   letter color
1       b  blue
2       a   red
3       d   0.0
4       d   0.0
5       e   0.0
6       a   red
7       a   red
8       c green
9       b  blue
10      c green
11      e   0.0
12      c green
13      b  blue
14      d   0.0
15      d   0.0
16      d   0.0
17      c green
18      e   0.0
19      a   red
20      c green
