extract text from string in R

Question

I have a lot of strings that all look similar, e.g.:

x1= "Aaaa_11111_AA_Whatiwant.txt"
x2= "Bbbb_11111_BBBB_Whatiwanttoo.txt"
x3= "Ccc_22222_CC_Whatiwa.txt"

I would like to extract Whatiwant, Whatiwanttoo, and Whatiwa in R.

I started with substring(x1,15,23), but I don't know how to generalize it. How can I always extract the part between the last _ and the .txt?

Thank you!

Add the regex tag and you'll get answers in the next 2 minutes.
– alexis_laz, commented Feb 26, 2015 at 16:38

NicE · Accepted Answer · 2015-02-26 16:46:36Z

2

You can use regexp capture groups:

gsub(".*_([^_]*)\\.txt","\\1",x1)
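
To see how this generalizes, here is a minimal sketch that applies the same call to all three example strings at once (gsub is vectorized over its input). The vector name x and the result name res are assumptions for illustration; the pattern itself is unchanged from the line above.

```r
# The three example strings from the question, collected in one vector
# (the name `x` is illustrative, not from the original post).
x <- c("Aaaa_11111_AA_Whatiwant.txt",
       "Bbbb_11111_BBBB_Whatiwanttoo.txt",
       "Ccc_22222_CC_Whatiwa.txt")

# .*_       greedy: everything up to and including the last underscore
# ([^_]*)   capture group 1: a run of characters that are not underscores
# \\.txt    a literal ".txt" (the dot is escaped)
# "\\1"     replace the whole match with the contents of capture group 1
res <- gsub(".*_([^_]*)\\.txt", "\\1", x)
print(res)
# expected values: "Whatiwant", "Whatiwanttoo", "Whatiwa"
```

Because .* is greedy, the underscore it stops at is the last one in the string, so the capture group picks up only the final piece before .txt.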

[Image: flowchart visualizing the regular expression]

edited Feb 26, 2015 at 16:46

answered Feb 26, 2015 at 16:40


4 Comments

user1267127 Over a year ago

How do you plot this flowchart?

NicE Over a year ago

Using this (it is JavaScript-style, so it can differ from R): http://www.regexplained.co.uk/. Plenty of other sites do the same.

user1267127 Over a year ago

Thanks. May I know why you only use .*_([^_]*)\.txt to get the flowchart? If I use the entire ".*_([^_]*)\\.txt","\\1", I get something different :-p

NicE Over a year ago

Because the website takes JavaScript-style regexps.

Antonio Rodriguez Franco · Answer · 2015-02-26 17:17 (edited by David Arenburg, 2015-02-26 17:35:10Z)

0

You can also use the stringr library, with functions such as str_extract (among many other possibilities), if you are not comfortable with regular expressions. It is extremely easy to use:

x1= "Aaaa_11111_AA_Whatiwant.txt"
x2= "Bbbb_11111_BBBB_Whatiwanttoo.txt"
x3= "Ccc_22222_CC_Whatiwa.txt"
library(stringr)
patron <- "(What)[a-z]+"
str_extract(x1, patron)
## [1] "Whatiwant"
str_extract(x2, patron)
## [1] "Whatiwanttoo"
str_extract(x3, patron)
## [1] "Whatiwa"
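
Note that the pattern above works because every target in the examples starts with "What". As a hedged sketch of a variant that depends only on position (after the last underscore, before .txt), str_extract can be combined with a lookahead. The names x, pattern_general and extracted are assumptions for illustration, and this is one possible pattern rather than the only one.

```r
library(stringr)

# Example strings from the question, in one vector (name `x` is illustrative).
x <- c("Aaaa_11111_AA_Whatiwant.txt",
       "Bbbb_11111_BBBB_Whatiwanttoo.txt",
       "Ccc_22222_CC_Whatiwa.txt")

# [^_]+         one or more characters that are not underscores
# (?=\\.txt$)   lookahead: the match must be followed by ".txt" at the
#               end of the string (the lookahead itself is not returned)
pattern_general <- "[^_]+(?=\\.txt$)"

extracted <- str_extract(x, pattern_general)
print(extracted)
# expected values: "Whatiwant", "Whatiwanttoo", "Whatiwa"
```

Unlike the "(What)[a-z]+" pattern, this does not assume anything about the letters in the target, only that it contains no underscore.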

edited Feb 26, 2015 at 17:35 by David Arenburg

answered Feb 26, 2015 at 17:17 by Antonio Rodriguez Franco
