I have this data frame (DF1):

structure(list(ID = 1:3, Text = c("there was not clostridium", "clostridium difficile positive", "test was OK")), class = "data.frame", row.names = c(NA, -3L)) 

ID Text
1  "there was not clostridium"
2  "clostridium difficile positive"
3  "test was OK"

and this data frame (DF2):

structure(list(ID = 1:3, Microorganisms = c("ESCHERICHIA COLI", "CLOSTRIDIUM DIFFICILE", "FUNGI")), class = "data.frame", row.names = c(NA, -3L))

ID Microorganisms
1  ESCHERICHIA COLI
2  CLOSTRIDIUM DIFFICILE
3  FUNGI

I would like to use regex to find matches between DF1 and DF2 and put them in a new column, like this:

ID Text                                Microorganism
1  "there was not clostridium"         CLOSTRIDIUM DIFFICILE
2  "clostridium difficile positive"    CLOSTRIDIUM DIFFICILE
3  "test was OK"                       no

I have tried something like this:

DF1 %>% mutate(Mikroorganism = ifelse(grepl(DF2$Microorganisms, Text), str_extract(Text, DF2$Microorganisms), "no"))

But it did not work.

  • A simple regex is not going to work with your first row: there is no "difficile". Are you looking for a match of any of the words in DF2, not the string as a whole? Commented Feb 2, 2021 at 13:35
  • Yes, I would like to match any of the words in DF2. Is that possible? Commented Feb 2, 2021 at 13:36

1 Answer

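One way is to use the fuzzyjoin package. Replacing the whitespace in each organism name with "|" turns it into an alternation pattern (for example "CLOSTRIDIUM|DIFFICILE"), so any single word from the name counts as a match. Rows without a match get NA rather than "no".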

DF1 %>%
  fuzzyjoin::regex_left_join(
    transmute(DF2, Microorganisms, ptn = gsub("\\s+", "|", Microorganisms)),
    by = c("Text" = "ptn"), ignore_case = TRUE) %>%
  select(-ptn)
#   ID                           Text        Microorganisms
# 1  1      there was not clostridium CLOSTRIDIUM DIFFICILE
# 2  2 clostridium difficile positive CLOSTRIDIUM DIFFICILE
# 3  3                    test was OK                  <NA>
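
If adding a package is not an option, roughly the same idea can be sketched in base R. This is only an illustration under a few assumptions that are not in the answer above: the helper `first_match` is a hypothetical name, only the first matching organism is kept per row (the join would return one row per match), and non-matches are filled with "no" as requested in the question.

```r
# Same data as in the question
DF1 <- data.frame(
  ID = 1:3,
  Text = c("there was not clostridium", "clostridium difficile positive", "test was OK")
)
DF2 <- data.frame(
  ID = 1:3,
  Microorganisms = c("ESCHERICHIA COLI", "CLOSTRIDIUM DIFFICILE", "FUNGI")
)

# One alternation pattern per organism, e.g. "CLOSTRIDIUM|DIFFICILE"
patterns <- gsub("\\s+", "|", DF2$Microorganisms)

# Hypothetical helper: return the first organism whose pattern matches
# the text (case-insensitive), or "no" when nothing matches
first_match <- function(txt) {
  hit <- vapply(patterns, grepl, logical(1), x = txt, ignore.case = TRUE)
  if (any(hit)) DF2$Microorganisms[which(hit)[1]] else "no"
}

DF1$Microorganism <- vapply(DF1$Text, first_match, character(1), USE.NAMES = FALSE)
DF1
```

Note that both this sketch and the join match substrings, so a short word such as "COLI" would also match inside a longer word; wrapping each word in `\\b` word boundaries may be worth considering if that matters for your data.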

1 Comment

Will it work when the Text column in DF1 contains more than one string per row? I mean, instead of just "there was not clostridium", it would be c("there was not clostridium", "some text", "some text").
