Subset dataframe using number in string?

Question

I have a large data frame that looks something like this:

df1 = data.frame(A=c("A23", "A53", "B68"), B=c("Something-2030-002", "Something-4030-002",
                                               "Something-5030-002"))

I want to subset it to keep only the observations of the form Something-X where the first digit of X is less than 5. That is:

df2 = data.frame(A=c("A23", "A53"), B=c("Something-2030-002", "Something-4030-002"))

How can I do this with R?

Thanks

akrun · Accepted Answer · 2015-01-27 17:26:20Z


You can use sub to remove all the characters except the one digit following the first "-" and use that to create a logical index.

df1[sub('[^-]+-(.).*', '\\1', df1$B)<5,]
#    A                  B
#1 A23 Something-2030-002
#2 A53 Something-4030-002
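
The one-liner above compares a character string with the number 5. R coerces 5 to "5" and compares strings, which gives the right result for a single digit. A minimal sketch that makes each step explicit and compares numerically instead, assuming every value in B contains a "-" followed by a digit (the name first_digit is introduced here only for illustration):

```r
# Sample data from the question
df1 <- data.frame(A = c("A23", "A53", "B68"),
                  B = c("Something-2030-002", "Something-4030-002",
                        "Something-5030-002"))

# Keep only the single character that follows the first "-"
first_digit <- sub("[^-]+-(.).*", "\\1", df1$B)

# Convert to integer so the comparison is numeric rather than string-based
df2 <- df1[as.integer(first_digit) < 5, ]
```

If a value in B does not match the pattern, sub returns it unchanged and as.integer will most likely yield NA with a warning, so it may be worth checking for NA before subsetting.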

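The pattern breaks down as follows:

  [^-]+-(.).*

[^-]+ matches one or more characters that are not "-", the next - matches the first hyphen, (.) captures the single character after it, and .* matches the rest of the string. sub then replaces the whole match with the captured character (\\1).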
