I have a large data frame that looks something like this:

df1 = data.frame(A=c("A23", "A53", "B68"), B=c("Something-2030-002", "Something-4030-002",
                                               "Something-5030-002"))

I want to subset it to include only the observations of the form Something-X..., where X (the first digit after "Something-") is less than 5. That is:

df2 = data.frame(A=c("A23", "A53"), B=c("Something-2030-002", "Something-4030-002"))

How can I do this with R?

Thanks

1 Answer

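You can use sub to remove all the characters except the one digit following the first "-", and use the result to create a logical index. Note that sub returns a character vector, so the comparison with 5 is a string comparison; that is safe here because only a single digit is compared.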

df1[sub('[^-]+-(.).*', '\\1', df1$B)<5,]
#    A                  B
#1 A23 Something-2030-002
#2 A53 Something-4030-002
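
If you would rather compare numbers than strings, a minimal sketch of the same idea converts the extracted character with as.numeric first. The names first_digit and keep are only illustrative, and the sketch assumes every value of B has a digit right after the first "-"; which() is used so that any value that fails to convert is dropped instead of producing an NA row.

```r
df1 <- data.frame(
  A = c("A23", "A53", "B68"),
  B = c("Something-2030-002", "Something-4030-002", "Something-5030-002")
)

# Keep only the single character that follows the first "-"
first_digit <- sub("[^-]+-(.).*", "\\1", df1$B)

# Convert to numeric so that "<" is a numeric comparison;
# which() drops NA results (non-digit characters) instead of indexing with them
keep <- which(as.numeric(first_digit) < 5)

df2 <- df1[keep, ]
df2
```

The one-liner above and this version select the same rows for the example data; the explicit conversion only matters if the captured part is later widened to more than one digit.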

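Regex breakdown

  [^-]+-(.).*

  [^-]+   one or more characters that are not "-" (here, "Something")
  -       the first "-"
  (.)     the next single character, captured as group 1
  .*      the rest of the string

Replacing the whole match with \\1 leaves only the captured character.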
