
I am looking for a way to remove every _ (that is, replace it with '') in each of the following character strings

x <- c('test_(match)','test_xMatchToo','test_a','test_b') 

if and only if _ is followed by ( or x. So the output wanted is:

x <- c('test(match)','testxMatchToo','test_a','test_b') 

How can this be done (using any package is fine)?

  • Easiest way I can think of is just to replace _( and _x with '' without using any regular expressions at all - It'll be faster and easier to read too. Commented Nov 15, 2016 at 17:58
  • Oh sorry, do this - Replace _( with ( and _x with x Commented Nov 15, 2016 at 18:00
  • John's suggestion, gsub("_([(x])", "\\1", x), seems generic enough to me, though this is not "without using any regex", so maybe I'm misunderstanding. Commented Nov 15, 2016 at 18:02
  • Yeah, not sure that's what he meant, but maybe I'm misunderstanding. I think he wanted to replace both strings with new strings. This was a minimal example; I may get many more characters to filter. Commented Nov 15, 2016 at 18:12

1 Answer


Using a lookahead:

_(?=[(x])

A lookahead asserts that its pattern matches at the current position, but it does not consume the text it looks ahead at. So here the matched text consists of only the underscore, while the lookahead asserts that the underscore is followed by an x or (.
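
To see that only the underscore is matched, here is a small sketch using base R's gregexpr() and regmatches() on the vector from the question. The variable names m and matched are only illustrative.

```r
# The vector from the question.
x <- c("test_(match)", "test_xMatchToo", "test_a", "test_b")

# Locate every match of the lookahead pattern; lookaheads need perl = TRUE.
m <- gregexpr("_(?=[(x])", x, perl = TRUE)

# Extract the matched text for each element of x.
matched <- regmatches(x, m)

# The first two strings yield just "_"; the last two yield no match at all.
print(matched)
```

The ( or x that follows the underscore never shows up in the extracted text, which is why replacing the match with "" leaves those characters in place.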

Demo on Regex101

Your R code would look a bit like this (one arg per line for clarity):

gsub(
    "_(?=[(x])",                            # The regex
    "",                                     # Replacement text
    c("your_string", "your_(other)_string"), # Vector of strings
    perl=TRUE                               # Make sure to use PCRE
)
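
Applied to the vector from the question, this might look like the sketch below. It also runs the capture-group variant suggested in the comments for comparison. The names lookahead and captured are only illustrative.

```r
# The vector from the question.
x <- c("test_(match)", "test_xMatchToo", "test_a", "test_b")

# Lookahead version: only the underscore is matched and removed.
lookahead <- gsub("_(?=[(x])", "", x, perl = TRUE)

# Capture-group version from the comments: match the underscore plus the
# following character, then put that character back with \\1.
# This one works with the default regex engine, so perl = TRUE is not needed.
captured <- gsub("_([(x])", "\\1", x)

print(lookahead)
# [1] "test(match)"   "testxMatchToo" "test_a"        "test_b"

# Both approaches give the same result on this input.
print(identical(lookahead, captured))
# [1] TRUE
```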