I have a data frame which is configured roughly like this:

df <- cbind(c('hello', 'yes', 'example'),c(7,8,5),c(0,0,0))
words frequency count
hello 7 0
yes 8 0
example 5 0

What I'm trying to do is add values to the third column from a different data frame, which is similar but looks like this:

df2 <- cbind(c('example','hello') ,c(5,6))
words frequency
example 5
hello 6

My goal is to find matching values for the first column in both data frames (they have the same column name) and add matching values from the second data frame to the third column of the first data frame.

The result should look like this:

df <- cbind(c('hello', 'yes', 'example'),c(7,8,5),c(6,0,5))
words frequency count
hello 7 6
yes 8 0
example 5 5

What I've tried so far is:

df <- merge(df,df2, by = "words", all.x=TRUE) 

However, it doesn't work.

I could use some help understanding how this could be done. Any help is welcome.

  • You can rename the frequency column of df2 to count and then left join by words. Commented Jun 30, 2022 at 2:31
  • cbind doesn't create data frames, it creates matrices. Use data.frame instead of cbind, and put the column names in there too, for the example to make sense. Commented Jun 30, 2022 at 2:44
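
To illustrate the comment above, here is a minimal sketch of building both objects with data.frame() so that the column names from the question (words, frequency, count) actually exist. The matrix m is only there for contrast and is not part of the original post.

```r
# Build real data frames with named columns.
df <- data.frame(
  words = c("hello", "yes", "example"),
  frequency = c(7, 8, 5),
  count = c(0, 0, 0)
)
df2 <- data.frame(
  words = c("example", "hello"),
  frequency = c(5, 6)
)

# For contrast: cbind() on a character vector and numeric vectors
# coerces everything to character and returns a matrix without column names.
m <- cbind(c("hello", "yes", "example"), c(7, 8, 5), c(0, 0, 0))
is.matrix(m)       # TRUE
is.data.frame(df)  # TRUE
```

With a matrix there is no "words" column to join on, which is probably why the merge call in the question fails.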

2 Answers

This is an "update join". My favorite way to do it is in dplyr:

library(dplyr)
df %>% rows_update(rename(df2, count = frequency), by = "words")

In base R you could do the same thing like this:

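# Rename df2's value column so it doesn't clash with df$frequency
names(df2)[2] = "count2"
# Left join: rows of df without a match get NA in count2
df = merge(df, df2, by = "words", all.x=TRUE)
# Keep the old count where there was no match
df$count = ifelse(is.na(df$count2), df$count, df$count2)
df$count2 = NULL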

Here is an option with data.table:

library(data.table)

setDT(df)[setDT(df2), on = "words", count := i.frequency]

Output

     words frequency count
    <char>     <num> <num>
1:   hello         7     6
2:     yes         8     0
3: example         5     5

Or using match in base R:

df$count[match(df2$words, df$words)] <- df2$frequency
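
The match line above relies on every word in df2 also being present in df. If that isn't guaranteed, matching in the other direction avoids NA subscripts. This is only a sketch: the extra word "absent" in df2 is a made-up value added to show the case, and idx and hit are names I've introduced for illustration.

```r
df <- data.frame(
  words = c("hello", "yes", "example"),
  frequency = c(7, 8, 5),
  count = c(0, 0, 0)
)
# Hypothetical df2 with one word ("absent") that df does not contain.
df2 <- data.frame(
  words = c("example", "hello", "absent"),
  frequency = c(5, 6, 9)
)

# For each row of df, find its position in df2 (NA when there is no match).
idx <- match(df$words, df2$words)
hit <- !is.na(idx)

# Update only the rows of df that have a counterpart in df2.
df$count[hit] <- df2$frequency[idx[hit]]
df
```

Rows of df without a match keep their old count, and words that only exist in df2 are ignored.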

Or another option with tidyverse using left_join and pmax (the existing count is 0 here, so the larger value is always the one from df2):

library(tidyverse)

left_join(df, df2 %>% rename(count.y = frequency), by = "words") %>%
  mutate(count = pmax(count.y, count, na.rm = TRUE)) %>%
  select(-count.y)
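
Note that pmax keeps the larger of the two values, so it would not overwrite an existing count that is bigger than the incoming one. If the intent is to take the df2 value whenever there is one, a variant with dplyr's coalesce could look like this. It is a sketch on the same example data, and res is just a name I picked.

```r
library(dplyr)

df <- data.frame(
  words = c("hello", "yes", "example"),
  frequency = c(7, 8, 5),
  count = c(0, 0, 0)
)
df2 <- data.frame(
  words = c("example", "hello"),
  frequency = c(5, 6)
)

res <- left_join(df, rename(df2, count.y = frequency), by = "words") %>%
  # coalesce() returns the first non-missing value per row:
  # the df2 value if the word matched, otherwise the existing count.
  mutate(count = coalesce(count.y, count)) %>%
  select(-count.y)
res
```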

Data

df <- structure(list(words = c("hello", "yes", "example"), frequency = c(7, 
8, 5), count = c(0, 0, 0)), class = "data.frame", row.names = c(NA, 
-3L))

df2 <- structure(list(words = c("example", "hello"), frequency = c(5, 6)), class = "data.frame", row.names = c(NA, 
-2L))
