I've searched extensively on Stack Overflow without finding an answer to the following question: I'm looking for a function that lets me test whether some columns contain any of the specified strings.

# I'm looking for heart attacks
infarction <- c("b", "c")
# I'm also looking for strokes
stroke <- c("h", "i")

#sample data set
set.seed(1234)
dat <- data.frame(A = sample(letters[1:9],10,TRUE),
                  B = sample(letters[1:9],10,TRUE),
                  C = sample(letters[1:9],10,TRUE),
                  D = sample(letters[1:9],10,TRUE),
                  DATE = sample.int(10, size = 10, replace = FALSE))

# I've tried many things. Among them:
# first one using the dplyr package.

infarction = ifelse( (infarction  %in% dat[,c("A", "B", "C", "D")]), DATE, NA))

# (a few lines of the mutate() call are omitted here)

#I've also tried
grep(paste(infarction,collapse="|"), dat[,1:4], value=TRUE), DATE, NA))

To sum up: I can get it to work if I check only one column or look for only one of the strings. But I need to check whether any of the strings is contained in any of the given columns and, in that case, return the date value as a new variable.

Any help is greatly appreciated.

  • What is the output you are looking for? An array of row numbers? A boolean? Commented Oct 10, 2014 at 11:56
  • Given that I'm aiming to create a new variable from this condition, I think a logical output would be preferred. Commented Oct 10, 2014 at 13:30

3 Answers

If you just want to know whether the columns contain the letters, you can apply the %in% function to each column of your data.frame:

apply(dat[,-5], 2, '%in%', x=stroke)
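
Note that this call answers, per column, whether each element of stroke occurs somewhere in that column; it does not say which rows match. As a minimal sketch of a row-wise test, assuming a small hand-made data frame `toy` (hypothetical, not the `dat` from the question) so the result is easy to check by eye:

```r
# Hypothetical toy data, written by hand rather than sampled
toy <- data.frame(A = c("a", "h", "c"),
                  B = c("b", "d", "e"),
                  DATE = c(10, 20, 30),
                  stringsAsFactors = FALSE)
stroke <- c("h", "i")

# Per-column test: one row per element of stroke, one column per data column
per_col <- apply(toy[, c("A", "B")], 2, '%in%', x = stroke)

# Row-wise test: TRUE if any of the listed columns holds one of the strings
hit <- apply(toy[, c("A", "B")], 1, function(row) any(row %in% stroke))

# New variable: DATE for matching rows, NA otherwise
toy$stroke <- ifelse(hit, toy$DATE, NA)
```

Here only the second row contains "h", so `hit` is TRUE for that row alone. Because `hit` has one element per row, it lines up with `toy$DATE` inside `ifelse`, which the per-column matrix does not.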

If you want to know the position of each letter of stroke in your columns, you can use the which function and apply it to each column:

apply(dat[,-5], 2, function(table,x) which(x==table), x=stroke[1])
apply(dat[,-5], 2, function(table,x) which(x==table), x=stroke[2])

or

sapply(stroke, function(y) apply(dat[,-5], 2, function(table,x) which(x==table), x=y), simplify=FALSE)

If you want to know the position of all letters of stroke, do this:

apply(dat[,-5], 2, function(x) unlist(sapply(stroke , function(i) which( i == x))))

1 Comment

Thanks Pop. I'm looking for an output where an additional variable is added and that variable is equal to the DATE variable for those rows with matching strings. I tried your code but I think I did something wrong; dat$infarction = ifelse((apply(dat[,-5], 2, '%in%', x=ami)), dat$DATE, NA)

It depends on what your desired output is. @Pop's answer is perfectly good; I just want to add that your grep approach was on the right track, except that you can't pass a data frame as the x argument. You have to apply it column-wise. This is because grep coerces the x argument to a character vector, and if you type as.character(dat[, 1:4]) you can see that you are grepping the wrong strings. Instead, try, for example:

grepres <- lapply(dat[1:4], grep, pattern = paste(infarction,collapse="|"))
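
To make the coercion issue concrete, here is a small sketch on a hypothetical two-column data frame `toy` (an assumption for illustration, not the question's data). It shows that coercing the whole data frame yields one string per column, whereas the column-wise call returns row indices:

```r
# Hypothetical toy data
toy <- data.frame(A = c("a", "b"),
                  B = c("c", "d"),
                  stringsAsFactors = FALSE)

# Coercing the data frame gives one deparsed string per column,
# not one string per cell, so grep would search the wrong strings
coerced <- as.character(toy)
length(coerced)  # 2, one element per column

# Column-wise: each list element holds the matching row indices
pattern <- paste(c("b", "c"), collapse = "|")
grepres <- lapply(toy, grep, pattern = pattern)
```

In this sketch `grepres$A` is 2 and `grepres$B` is 1. Keep in mind that grep matches substrings, so a pattern such as "b" would also match a longer value like "abc"; for whole-value matching, %in% or an anchored pattern may be the safer choice.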

If you want to find which rows don't contain the pattern, you can do, for example:

idx <- setdiff(1:nrow(dat), unlist(grepres))

and then

result <- rep(NA, nrow(dat))
result[idx] <- dat$DATE[idx]
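
If the goal is instead the DATE of the rows that do match, as the question describes, the same grep result can be used without taking the set difference. This is a sketch under that assumption, again on a hypothetical `toy` data frame:

```r
# Hypothetical toy data
toy <- data.frame(A = c("a", "b", "e"),
                  B = c("c", "d", "f"),
                  DATE = c(1, 2, 3),
                  stringsAsFactors = FALSE)
infarction <- c("b", "c")

# Matching row indices, computed column by column
grepres <- lapply(toy[c("A", "B")], grep,
                  pattern = paste(infarction, collapse = "|"))

# Rows where at least one column matched
matched <- sort(unique(unlist(grepres)))

# DATE for matching rows, NA elsewhere
result <- rep(NA, nrow(toy))
result[matched] <- toy$DATE[matched]
```

Rows 1 and 2 each contain one of the strings, so `result` holds their DATE values and NA for row 3.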

The following may help:

> apply(dat, 1, function(x) ifelse(infarction %in% x[1:4],x[5],NA  ))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] " 6" NA   " 3" NA   " 2" NA   " 7" NA   " 1" NA   
[2,] " 6" "10" " 3" NA   " 2" NA   " 7" " 9" NA   " 4" 

In data.frame form:

> as.data.frame(t(apply(dat, 1, function(x) ifelse(infarction %in% x[1:4],x[5],NA  ))))

     V1   V2
1     6    6
2  <NA>   10
3     3    3
4  <NA> <NA>
5     2    2
6  <NA> <NA>
7     7    7
8  <NA>    9
9     1 <NA>
10 <NA>    4

For stroke:

> as.data.frame(t(apply(dat, 1, function(x) ifelse(stroke %in% x[1:4],x[5],NA  ))))

     V1   V2
1  <NA> <NA>
2  <NA> <NA>
3  <NA> <NA>
4  <NA>    5
5     2 <NA>
6     8 <NA>
7  <NA> <NA>
8  <NA>    9
9     1    1
10    4 <NA>
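
The output above has one column per search string. If a single variable per condition is wanted, one option is to combine the per-column tests with a logical OR. The helper `date_if_any` below is a hypothetical name introduced for this sketch, and `toy` is hand-made example data rather than the question's `dat`:

```r
# Hypothetical helper: DATE where any of `cols` holds one of `codes`, else NA
date_if_any <- function(df, cols, codes, date_col = "DATE") {
  # One logical vector per column, combined element-wise with OR
  hit <- Reduce(`|`, lapply(df[cols], function(col) col %in% codes))
  ifelse(hit, df[[date_col]], NA)
}

# Hypothetical toy data
toy <- data.frame(A = c("b", "a", "h"),
                  B = c("a", "a", "c"),
                  DATE = c(5, 6, 7),
                  stringsAsFactors = FALSE)

toy$infarction <- date_if_any(toy, c("A", "B"), c("b", "c"))
toy$stroke     <- date_if_any(toy, c("A", "B"), c("h", "i"))
```

This avoids apply over rows, so DATE keeps its numeric type instead of being converted to character along with the letter columns.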
