From the string
s <- "|tree| Lorem ipsum dolor sit amet, |house| consectetur adipiscing elit,
|street| sed do eiusmod tempor incididunt ut labore et |car| dolore magna aliqua."
I want to extract the text that follows each word enclosed in | symbols.
My approach:
words <- list("tree","house","street","car")
for (word in words) {
  expression <- paste0("^.*\\|", word, "\\|\\s*(.+?)\\s*\\|.*$")
  print(sub(expression, "\\1", s))
}
This works fine for all but the last word, car. For that word it instead returns the entire string s.
How can I modify the regex so that, for the last element of the words list, it prints dolore magna aliqua.?
Edit: Previously the list of expressions was a, b, c, d. Solutions to that specific problem cannot be generalized very well.
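A likely reason for the failure: the pattern requires another | after the captured text, and nothing follows the text after |car|, so sub finds no match and returns s unchanged. A minimal sketch of one possible fix, assuming the words contain no regex metacharacters, is to make the trailing part optional. The use of perl = TRUE, the (?s) flag and the results vector are choices made for this sketch, not part of the original code.

```r
s <- "|tree| Lorem ipsum dolor sit amet, |house| consectetur adipiscing elit,
|street| sed do eiusmod tempor incididunt ut labore et |car| dolore magna aliqua."
words <- c("tree", "house", "street", "car")

results <- character(0)
for (word in words) {
  # (?s) lets . match the line break inside s.
  # ([^|]*?) captures the text up to the next |, and (?:\\|.*)? makes
  # everything from that | onwards optional, so the last word also matches.
  expression <- paste0("(?s)^.*\\|", word, "\\|\\s*([^|]*?)\\s*(?:\\|.*)?$")
  results[word] <- sub(expression, "\\1", s, perl = TRUE)
}
print(results)
```

Capturing [^|]*? instead of .+? also keeps the captured text from running across a later | marker.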
I find sub confusing in these cases, since you have to specify what you DON'T want to keep instead of (the more natural) what you DO want to keep. I'd advise using stringi::stri_extract_all, for example: stringi::stri_extract_all(regex = "(?<=\\|[abcd]\\| )([^\\|]+)", s). This uses a lookbehind to match the |a|, |b|, |c| and |d| without capturing it.

My words are not a, b, c, d but instead tree, house, street, car. How would I do it?
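The lookbehind idea can be carried over to whole words by building one pattern per word, which keeps each lookbehind at a fixed length. The following is a rough sketch in base R rather than stringi, so that it runs without extra packages. The helper name extract_after is hypothetical, and the words are assumed to contain no regex metacharacters.

```r
s <- "|tree| Lorem ipsum dolor sit amet, |house| consectetur adipiscing elit,
|street| sed do eiusmod tempor incididunt ut labore et |car| dolore magna aliqua."
words <- c("tree", "house", "street", "car")

# Hypothetical helper: return the text that follows |word| in text.
extract_after <- function(word, text) {
  # The lookbehind asserts the literal marker |word| without consuming it;
  # [^|]+ then takes everything up to the next | or the end of the string.
  pattern <- paste0("(?<=\\|", word, "\\|)[^|]+")
  hit <- regmatches(text, regexpr(pattern, text, perl = TRUE))
  # regmatches returns character(0) when the marker is absent.
  if (length(hit) == 0) NA_character_ else trimws(hit)
}

result <- vapply(words, extract_after, character(1), text = s)
print(result)
```

Because the pattern stops at the next | or at the end of the string, the last word needs no special handling, and trimws removes the surrounding spaces and the line break.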