I have two data frames, DF1 and DF2:

DF1 <- as.data.frame(c("A, B","C","A","C, D"))
names(DF1) <- c("parties")

DF1

 parties
    A, B
       C
       A
    C, D

library(dplyr)  # provides bind_cols()

B <- as.data.frame(c(LETTERS[1:10]))
C <- as.data.frame(1:10)
DF2 <- bind_cols(B,C)
names(DF2) <- c("party","party.number")

DF2

   party party.number
      A            1
      B            2
      C            3
      D            4
      E            5
      F            6
      G            7
      H            8
      I            9
      J           10

The desired result should be an additional column in DF1 which contains the party numbers taken from DF2 for each row in DF1.

Desired result (based on DF1):

  parties party.numbers
    A, B          1, 2
       C             3
       A             1
    C, D          3, 4

I strongly suspect that the answer involves something like str_match(DF1$parties, DF2$party.number) or a similar regular expression, but I can't figure out how to put two (or more) party numbers into the same row of the new column (DF1$party.numbers).

2 Answers

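One option is gsubfn() from the gsubfn package: match each upper-case letter with the pattern "[A-Z]" and pass a named key/value list (party as name, party number as value) as the replacement, so that every matched letter is swapped for its number.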

library(gsubfn)
DF1$party.numbers <- gsubfn("[A-Z]", setNames(as.list(DF2$party.number), 
           DF2$party), as.character(DF1$parties))
DF1
#   parties party.numbers
#1    A, B          1, 2
#2       C             3
#3       A             1
#4    C, D          3, 4
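
For comparison, the same key/value lookup can be sketched in base R without extra packages. This is only an illustration and not part of the answer above: it assumes the parties are separated by commas, and the helper name split_parties is made up here. It splits each entry with strsplit(), looks the pieces up with match(), and pastes the numbers back together.

```r
# Data as defined in the question
DF1 <- data.frame(parties = c("A, B", "C", "A", "C, D"),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(party = LETTERS[1:10], party.number = 1:10,
                  stringsAsFactors = FALSE)

# Split every entry on a comma followed by optional whitespace
# (split_parties is a hypothetical helper name)
split_parties <- strsplit(as.character(DF1$parties), ",\\s*")

# For each row, look up the party numbers in DF2 and collapse them
# back into a single comma-separated string
DF1$party.numbers <- vapply(
  split_parties,
  function(p) paste(DF2$party.number[match(p, DF2$party)], collapse = ", "),
  character(1)
)

DF1
```

Unlike the "[A-Z]" pattern, this lookup does not depend on party names being a single letter. A party that is missing from DF2 would show up as the string "NA" in the result.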

An alternative is to use the tidyverse: reshape DF1 so that each party sits in its own row, join DF2, and then collapse the rows back into the original form:

library(tidyverse)

DF1 <- as.data.frame(c("A, B","C","A","C, D"))
names(DF1) <- c("parties")

B <- as.data.frame(c(LETTERS[1:10]))
C <- as.data.frame(1:10)
DF2 <- bind_cols(B,C)
names(DF2) <- c("party","party.number")


DF1 %>%
  group_by(id = row_number()) %>%
  separate_rows(parties) %>%
  left_join(DF2, by=c("parties"="party")) %>%
  summarise(parties = paste(parties, collapse = ", "),
            party.numbers = paste(party.number, collapse = ", ")) %>%
  select(-id)

# # A tibble: 4 x 2
#   parties party.numbers
#   <chr>   <chr>        
# 1 A, B    1, 2         
# 2 C       3            
# 3 A       1            
# 4 C, D    3, 4 
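
One caveat worth sketching: by default, separate_rows() splits on any run of non-alphanumeric characters, so a party name containing a space would be broken apart. The variant below passes an explicit sep instead. The multi-word party names are hypothetical and not taken from the question, and the row id is added with mutate() rather than inside group_by(), purely as a stylistic choice.

```r
library(dplyr)
library(tidyr)

# Hypothetical data with multi-word party names (an assumption for illustration)
DF1 <- data.frame(parties = c("Green Party, Labour", "Labour"),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(party = c("Green Party", "Labour"),
                  party.number = 1:2,
                  stringsAsFactors = FALSE)

result <- DF1 %>%
  mutate(id = row_number()) %>%                  # remember the original row
  separate_rows(parties, sep = ",\\s*") %>%      # split on commas only
  left_join(DF2, by = c("parties" = "party")) %>%
  group_by(id) %>%
  summarise(parties = paste(parties, collapse = ", "),
            party.numbers = paste(party.number, collapse = ", ")) %>%
  select(-id)

result
```

With the question's single-letter parties the default separator is sufficient, so this only matters if the real party names are longer.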
