Below is a sample of the data in a column Region. I need to remove " (more info)" from the data.

  1. Sri Lanka (more info)
  2. Guyana (more info)
  3. Mongolia
  4. Kazakhstan (more info)
  5. Suriname

Tried: as.character(gsub("[\\ (more info)]", "", States$Regions)) -> abc. This does not give the expected result.

Expected Result:

  1. Sri Lanka
  2. Guyana
  3. Mongolia
  4. Kazakhstan
  5. Suriname
  • @Gregor, thanks. gsub(" \\(more info\\)", "", States$Regions) -> States$Regions is working fine. Commented Apr 2, 2018 at 20:08
  • Possible duplicate of Remove part of a string in dataframe column (R). Commented Apr 2, 2018 at 22:02
  • @Tjebo, this is not a duplicate. Commented Apr 2, 2018 at 22:53

1 Answer


A few things wrong.

1) Don't use brackets here. In regex, [abc] matches a or b or c. You want to match the whole pattern, so don't use brackets. (You could use parentheses, but they are not necessary.)

"\\ (more info)"  # fix 1: no brackets

2) You seem to know backslashes are used to escape things in regex. But they must be next to what they are escaping! Here you are escaping a space, which is meaningless. You need to escape both parentheses that are part of your pattern:

"\\(more info\\)"  # fix 2: escape parens

3) You still need the space, but it goes at the front, before the (escaped) parenthesis:

" \\(more info\\)"  # fix 3: space at beginning 
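
To see the difference between the bracketed pattern and the corrected one, here is a small sketch run on the sample values from the question. The vector name regions and the result names broken and fixed are made up for illustration.

```r
# Sample values from the question (the name `regions` is hypothetical).
regions <- c("Sri Lanka (more info)", "Guyana (more info)", "Mongolia",
             "Kazakhstan (more info)", "Suriname")

# Original attempt: the brackets form a character class, so every single
# space, parenthesis, and letter of "more info" is deleted wherever it occurs,
# including inside the country names.
broken <- gsub("[\\ (more info)]", "", regions)

# Corrected pattern: a literal space followed by the escaped parentheses.
fixed <- gsub(" \\(more info\\)", "", regions)

print(broken)
print(fixed)
```

Note that even "Mongolia" and "Suriname", which have no " (more info)" suffix, are damaged by the bracketed version, because letters like o, r, and e are removed one by one.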

Now the pattern should work. Also note that gsub returns a character vector, so your as.character is redundant.
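
As a rough check of the as.character point, the sketch below builds a stand-in data frame. Only the names States and Regions come from the question; the contents and the choice of a factor column are assumptions, made to show that gsub coerces its input itself.

```r
# Hypothetical stand-in for the asker's data frame.
States <- data.frame(
  Regions = c("Sri Lanka (more info)", "Guyana (more info)", "Mongolia",
              "Kazakhstan (more info)", "Suriname"),
  stringsAsFactors = TRUE  # assumption: force a factor column to show the coercion
)

cleaned <- gsub(" \\(more info\\)", "", States$Regions)

# gsub() has already coerced the factor and returned a character vector,
# so wrapping it in as.character() adds nothing.
print(is.character(cleaned))

# Write the result back to the column.
States$Regions <- cleaned
print(States)
```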

I'd strongly recommend using a site like regex101.com to debug regex. You only need a single \ to escape there, but other than that it is just like R. Here's your example. Check out the sidebar for nice explanations.
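
The single versus double backslash difference comes from R's string syntax rather than from the regex itself. A quick way to check what the regex engine actually receives (the variable name pattern is just for illustration):

```r
pattern <- " \\(more info\\)"

# writeLines() prints the string's real contents: each "\\" typed in the
# R source is a single backslash in the pattern, which is the form you would
# paste into a regex tester.
writeLines(pattern)

# 14 characters, not 16: each doubled backslash counts once.
print(nchar(pattern))
```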
