I have a directory with multiple *.pdf files. I want to find files with two or more keywords/strings in the filename.

In the example below, I would like to find files that have both "trials" and "neurology" in the filename. I used the following code, but it returns no results.

    keyword1 = "trials"
    keyword2 = "neurology"
    pattern <- c("keyword1", "keyword2")
    whichfile <- grep(
        x = list.files("~/my_documents"),
        pattern = pattern,
        value = TRUE)
  • Are you looking for presence of both the keywords or a match of just one? Commented Feb 26, 2018 at 6:13
  • Where does the ampersand fit in? The title is confusing me ... Commented Feb 26, 2018 at 6:15

3 Answers

Here's a solution that accepts an arbitrary number of patterns. Moreover, the patterns can be arbitrary regular expressions.

    allpatterns <- function(fnames, patterns) {
      i <- sapply(fnames, function(fn) all(sapply(patterns, grepl, fn)) )
      fnames[i]
    }

    filenames <- c("foo.txt", "bar.R", "foo_quux.py", "quux.c", "quux.foo",
                   "foo_bar", "bar.foo.cpp", "foo_bar_quux", "quux_foo.bar", "nothing")
    allpatterns(filenames, c("foo", "quux"))
    # [1] "foo_quux.py"  "quux.foo"     "foo_bar_quux" "quux_foo.bar"
    allpatterns(filenames, c("foo", "bar"))
    # [1] "foo_bar"      "bar.foo.cpp"  "foo_bar_quux" "quux_foo.bar"
    allpatterns(filenames, c("foo", "bar", "quux"))
    # [1] "foo_bar_quux" "quux_foo.bar"
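
To tie this back to the question, here is a minimal sketch that applies the same "all patterns must match" idea to names returned by list.files(). The temporary directory and the filenames are made up for illustration, and `all_keywords` is a hypothetical helper name, not part of the answer above. It is a vectorised variant that also forwards extra arguments such as ignore.case to grepl().

```r
# Hypothetical example files in a throwaway directory (assumption: not the asker's real data)
demo_dir <- file.path(tempdir(), "pdf_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_files <- c("neurology_trials_2017.pdf", "trials_cardiology.pdf",
                "neurology_review.pdf", "Trials-in-Neurology.pdf", "notes.txt")
invisible(file.create(file.path(demo_dir, demo_files)))

# TRUE for every name that matches all of the patterns (vectorised over names);
# extra arguments in ... are passed on to grepl()
all_keywords <- function(fnames, patterns, ...) {
  hits <- lapply(patterns, grepl, x = fnames, ...)
  Reduce(`&`, hits, rep(TRUE, length(fnames)))
}

# Only consider PDFs, then require both keywords
pdfs <- list.files(demo_dir, pattern = "\\.pdf$")
both <- pdfs[all_keywords(pdfs, c("trials", "neurology"))]
both_any_case <- pdfs[all_keywords(pdfs, c("trials", "neurology"), ignore.case = TRUE)]

print(both)
print(both_any_case)
```

With these made-up names, the case-sensitive call keeps only neurology_trials_2017.pdf, while the ignore.case variant additionally keeps Trials-in-Neurology.pdf.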

Try this:

    myfun <- function(x) {
      filenames <- list.files(path = "~/my_documents/")
      # Keep the names that contain both keywords as literal (fixed) strings
      boolvec <- grepl(pattern = x[1], x = filenames, fixed = TRUE) &
        grepl(pattern = x[2], x = filenames, fixed = TRUE)
      filenames[boolvec]
    }

    myfun(c(keyword1, keyword2))

To check an arbitrary number of keywords:

    keywords <- c("trials", "neurology", "foo")
    filenames <- list.files(path = "~/my_documents/")
    # One logical vector per keyword, combined element-wise with AND
    boolvec <- Reduce(function(x, y) x & y,
                      Map(function(patt) grepl(pattern = patt, x = filenames, fixed = TRUE), keywords))
    filenames[boolvec]
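
The fixed = TRUE argument used above makes grepl() treat each keyword as a literal string rather than a regular expression. A small sketch of the difference, using made-up filenames:

```r
# Made-up filenames for illustration
fnames <- c("report_v1.2_final.pdf", "report_v1x2_draft.pdf")

# As a regular expression, "." matches any character, so both names match
regex_hits <- grepl("v1.2", fnames)

# With fixed = TRUE the keyword is matched literally, so only the first name matches
fixed_hits <- grepl("v1.2", fnames, fixed = TRUE)

print(regex_hits)
print(fixed_hits)
```

One caveat: grepl() ignores ignore.case = TRUE when fixed = TRUE (it issues a warning), so for case-insensitive literal matching one option is to lower-case both the keywords and the filenames with tolower() first.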

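I modified your code so that it produces a result on my computer. You don't need grep(), because list.files() accepts a pattern argument itself. Your code also searches for files containing the literal strings "keyword1" and "keyword2", because the variable names are quoted.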

    keyword1 <- "quote"
    keyword2 <- "Project"
    pattern <- c(keyword1, keyword2)
    whichfile <- vector()
    # Append the files in the working directory that match each keyword in turn
    for (i in pattern) {
      whichfile <- c(whichfile, list.files(getwd(), pattern = i))
    }
    whichfile

    [1] "quoteCloud.jpeg"         "quotecloud2.jpeg"        "quotecloud3.jpeg"        "quotecloud4.jpeg"
    [5] "Valentinequote.R"        "Project 2 Markdown.Rmd"  "Project 2 Script.R"      "Project_2_Markdown.html"
    [9] "Project_4 rough.R"       "Project3.html"           "Project3.Rmd"

Hope this helps you.
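
Note that the loop above collects files matching either keyword. If the goal is files whose names contain both keywords, one possible variation is to intersect the per-keyword results instead of concatenating them. A sketch, using a temporary directory and invented filenames (both are assumptions for illustration):

```r
# Invented files in a throwaway directory (assumption for illustration only)
demo_dir <- file.path(tempdir(), "keyword_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir,
                                c("quote_Project.R", "quoteCloud.jpeg", "Project3.Rmd"))))

pattern <- c("quote", "Project")

# One character vector of matching filenames per keyword
per_keyword <- lapply(pattern, function(p) list.files(demo_dir, pattern = p))

either <- unique(unlist(per_keyword))   # names matching at least one keyword
both <- Reduce(intersect, per_keyword)  # names matching every keyword

print(either)
print(both)
```

Here `either` contains all three invented files, while `both` contains only quote_Project.R.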

3 Comments

I think you should correct your keyword2. Also, I think OP is looking for occurrence of both keywords in the filenames.
@TUSHAr I was testing something myself, since my working directory has a bunch of randomly named files. But I will edit it to show it works with both words.
@TUSHAr corrected to reflect both. Thank you for the feedback.
