Extract string between /

Question

If I have these strings:

mystrings <- c("X2/D2/F4",
               "X10/D9/F4",
               "X3/D22/F4",
               "X9/D22/F9")

How can I extract 2, 9, 22, 22? These are the characters between the two / characters, after the first character of that middle segment.

I would like to do this in a vectorized fashion and, if possible, add the new column with transform, which I am familiar with.

I think this regex gets me somewhere near all the characters within \:

^.*\\'(.*)'\\.*$

+1 for all. @Arun gave me the first workable answer. I just don't work with strings enough.
– user1320502, Commented Jan 3, 2013 at 20:24

IRTFM · Accepted Answer · 2016-12-05 18:34:05Z

29

> gsub("(^.+/[A-Z]+)(\\d+)(/.+$)", "\\2", mystrings)
[1] "2"  "9"  "22" "22"

You would "read" (or "parse") that regex pattern as splitting any matched string into three parts:

1) anything up to and including the first forward slash followed by a sequence of capital letters,

2) any sequence of digits (= "\d") before the next slash, and

3) from the next slash to the end.

The replacement "\\2" then returns only the second part.

Non-matched character strings would be returned unaltered.
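To make the three parts visible, here is a minimal sketch using base R's regexec() and regmatches() with the same pattern. The input "no-slashes" is a made-up string, added only to illustrate the non-matching case.

```r
mystrings <- c("X2/D2/F4", "X10/D9/F4", "X3/D22/F4", "X9/D22/F9")
pattern <- "(^.+/[A-Z]+)(\\d+)(/.+$)"

# regexec() + regmatches() return, for each string, the full match
# followed by the text captured by each parenthesised group
parts <- regmatches(mystrings, regexec(pattern, mystrings))
parts[[1]]

# Element 1 is the full match, so the second group sits at index 3
second <- vapply(parts, function(p) p[3], character(1))

# Hypothetical input that does not fit the pattern: gsub() leaves it as-is
unmatched <- gsub(pattern, "\\2", "no-slashes")
```

Because a non-matching string passes through unchanged, it may be worth checking the result (for example with grepl() on the same pattern) before converting to numeric.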

edited Dec 5, 2016 at 18:34

answered Jan 3, 2013 at 20:14

1 Comment

Justin Over a year ago

+1 I did not know you could grab the second set of matches with \\2 without a second group! Slick.

Arun · Accepted Answer · 2013-01-03 20:11:15Z

20

as.numeric(gsub("^.*D([0-9]+).*$", "\\1", mystrings))
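Since the question asks for a new column added with transform(), the same call could be used as sketched below. The data frame df and the column names id and d_num are hypothetical, chosen only for illustration.

```r
mystrings <- c("X2/D2/F4", "X10/D9/F4", "X3/D22/F4", "X9/D22/F9")

# Hypothetical data frame; "id" and "d_num" are illustrative names
df <- data.frame(id = mystrings, stringsAsFactors = FALSE)

# transform() evaluates the expression with the data frame's columns in scope,
# so the column "id" can be used directly inside gsub()
df <- transform(df, d_num = as.numeric(gsub("^.*D([0-9]+).*$", "\\1", id)))
df$d_num
```

Note that this pattern relies on the middle segment starting with a literal D, which holds for the strings in the question.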

answered Jan 3, 2013 at 20:11

Roman Luštrik · Accepted Answer · 2013-01-03 20:18:45Z

8

@Arun stole my thunder, so I'm giving my initial long-winded example.

cut.to.pieces <- strsplit(mystrings, split = "/")
got.second <- lapply(cut.to.pieces, "[", 2)
get.numbers <- unlist(got.second)
as.numeric(gsub(pattern = "[[:alpha:]]", replacement = "", x = get.numbers, perl = TRUE))
[1]  2  9 22 22

answered Jan 3, 2013 at 20:18

Matthew Plourde · Accepted Answer · 2013-01-03 20:22:09Z

8

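Using str_extract from the stringr package (the original answer wrapped the pattern in perl(), which has been deprecated since stringr 1.0.0; the pattern string can be passed directly):

library(stringr)
as.numeric(str_extract(mystrings, '(?<=/[A-Z])[0-9]+(?=/)'))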

edited Jan 3, 2013 at 20:22

answered Jan 3, 2013 at 20:13

1 Comment

Matthew Plourde Over a year ago

@rrs It's part of a look-behind assertion. Type ?regex at the R prompt and read the last few paragraphs of the "Perl-like Regular Expressions" section.
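For readers without stringr, the same look-behind and look-ahead pattern can be tried in base R with perl = TRUE. This is a sketch of an equivalent, not part of the original answer.

```r
mystrings <- c("X2/D2/F4", "X10/D9/F4", "X3/D22/F4", "X9/D22/F9")

# (?<=/[A-Z]) : look-behind, requires "/" plus one capital letter just before
# [0-9]+      : the digits that are actually returned
# (?=/)       : look-ahead, requires a "/" right after, without consuming it
m <- regexpr("(?<=/[A-Z])[0-9]+(?=/)", mystrings, perl = TRUE)
nums <- as.numeric(regmatches(mystrings, m))
```

One difference to keep in mind: regmatches() with regexpr() drops elements that have no match, so the result can be shorter than the input, whereas str_extract() returns NA in that position.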

thelatemail · Accepted Answer · 2013-01-03 20:21:29Z

4

This ended up being a compacted version of @RomanLuštrik's answer:

gsub("[^0-9]","",sapply(strsplit(mystrings,"/"),"[",2))
[1] "2"  "9"  "22" "22"

answered Jan 3, 2013 at 20:21

Jim · Accepted Answer · 2014-11-26 20:51:59Z

1

Using rex may make this type of task a little simpler.

library(rex)

matches <- re_matches(mystrings,
  rex(
    "/",
    any,
    capture(name = "numbers", digits)
  )
)

as.numeric(matches$numbers)
#>[1]  2  9 22 22

answered Nov 26, 2014 at 20:51

moodymudskipper · Accepted Answer · 2019-11-06 12:20:28Z

Using the package unglue you could do:

# install.packages("unglue")
library(unglue)

unglue_vec(mystrings, "{x}/{y}/{z}", var = "y")
#> [1] "D2"  "D9"  "D22" "D22"

From a data frame you could use unglue_unnest() so no need to use transform()

df <- data.frame(col = mystrings)
unglue_unnest(df, col, "{x}/{y}/{z}", remove = FALSE)
#>         col   x   y  z
#> 1  X2/D2/F4  X2  D2 F4
#> 2 X10/D9/F4 X10  D9 F4
#> 3 X3/D22/F4  X3 D22 F4
#> 4 X9/D22/F9  X9 D22 F9

# or use unnamed subpatterns to keep only the middle value
unglue_unnest(df, col, "{=.*?}/{y}/{=.*?}", remove = FALSE)
#>         col   y
#> 1  X2/D2/F4  D2
#> 2 X10/D9/F4  D9
#> 3 X3/D22/F4 D22
#> 4 X9/D22/F9 D22

Created on 2019-11-06 by the reprex package (v0.3.0)

More info: https://github.com/moodymudskipper/unglue/blob/master/README.md
