I have a question about how to write a loop in R that checks whether a certain expression occurs in a string. I want to check whether the expression "i-sty" occurs in my variable for each i in 1:200 and, if it does, return the corresponding i.

For example, if we have "4-sty" the loop should give me 4, and if there is no "i-sty" in the variable it should give me "." for that observation.

I used

for (i in 1:200){
  datafram$height <- ifelse(grepl("i-sty", dataframe$Description), i, ".")
}

But it did not work: I literally only get dots. Attached is a picture of the string variable.

  • "i-sty" is just a string with the letter i in it. To use a regex pattern with your variable i, you need to paste together a string, e.g., grepl(paste0(i, "-sty"), ...). I'd also recommend using NA rather than "." for the "else" result - that way the resulting height variable can be numeric. Commented Jun 1, 2020 at 14:05
  • Welcome to Stack Overflow! Please make your example reproducible, read How to Ask and stackoverflow.com/questions/5963269/… Commented Jun 1, 2020 at 14:19
  • x <- c("6-sty xxx", "4-sty yyyy", NA, "sty zzz", "32-sty xyz"); as.numeric(sub("^([0-9]+)-sty.*", "\\1", x)) Commented Jun 1, 2020 at 17:40

1 Answer

"i-sty" is just a string with the letter i in it. To use a regex pattern with your variable i, you need to paste together a string, e.g., grepl(paste0(i, "-sty"), ...). I'd also recommend using NA rather than "." for the "else" result - that way the resulting height variable can be numeric.

for (i in 1:200){
  dataframe$height <- ifelse(grepl("i-sty", dataframe$Description), i, ".")
}

The above works syntactically, but not logically. You also have a problem that you are overwriting height each time through the loop - when i is 2, you erase the results from when i is 1, when i is 3, you erase the results from when i is 2... I think a better approach would be to extract the match, which is easy using stringr (but also possible in base). As a benefit, with the right pattern we can skip the loop entirely:

library(stringr)

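# Column 1 of the str_match() result is the full match (e.g. "4-sty");
# column 2 is the first capture group, i.e. just the digits.
dataframe$height = str_match(string = dataframe$Description, pattern = "([0-9]+)-sty")[, 2]
# might want to wrap in `as.numeric`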

You use both datafram and dataframe. I've assumed dataframe is correct.
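
For comparison, here is a minimal base R sketch of both routes: a loop that no longer overwrites earlier results, and a loop-free extraction. The sample data is made up for illustration (borrowed from the comment above), and the names height_loop, height_vec, desc and has_sty are hypothetical. The "(^|[^0-9])" prefix is my own addition, assumed to be wanted so that "2-sty" is not found inside "32-sty".

```r
# Hypothetical sample data; only the column name Description comes from the question.
dataframe <- data.frame(
  Description = c("6-sty xxx", "4-sty yyyy", NA, "sty zzz", "32-sty xyz"),
  stringsAsFactors = FALSE
)

# Loop version: start from NA and fill in only the rows that match the
# current i, so results from earlier iterations are kept.
height_loop <- rep(NA_integer_, nrow(dataframe))
for (i in 1:200) {
  # Require start-of-string or a non-digit before i (assumption, see lead-in).
  hit <- grepl(paste0("(^|[^0-9])", i, "-sty"), dataframe$Description)
  height_loop[hit] <- i
}

# Loop-free version: find the rows containing "<digits>-sty", then keep
# only the digits via a capture group.
desc <- dataframe$Description
has_sty <- grepl("[0-9]+-sty", desc)  # grepl() returns FALSE for NA
height_vec <- rep(NA_integer_, length(desc))
height_vec[has_sty] <- as.integer(
  sub(".*?([0-9]+)-sty.*", "\\1", desc[has_sty], perl = TRUE)
)

dataframe$height <- height_vec
```

On this sample both vectors should agree, with NA for the rows that have no "<digits>-sty" part. The loop only covers 1 to 200, while the vectorized version accepts any number of digits.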


1 Comment

Thank you so much for your help. The code you proposed works very well with [, 1] for me.
