1

I need to select all the rows where the string in one of the columns begins with either "B-" or "B^". I tried a bunch of combinations, but I suspect it is not working because "-" and "^" also have special meanings in the regular expressions that grep uses.

dataset[grep('^(B-|B^)[^B-|B^]*$', dataset$Col1),]

With the above script, rows beginning with "B^" are not being extracted. Please suggest a smart way to handle this.

2 Answers

1

You can escape the special characters with \\ in the grep pattern:

dataset[grep('^(B\\-|B\\^)[^B\\-|B\\^]*$', dataset$Col1),]

1 Comment

That worked perfectly. Thank you!! And thanks for inline edits too :)
0

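For further explanation: ^ is an anchor that matches the beginning of a string, so to match a literal ^ anywhere else in the pattern you have to escape it. The [] form a character class, and a leading ^ inside them negates it, so [^B-|B^]* matches any run of characters that are not in the class. Note that inside brackets B-| is read as a range running from B to |, not as the three separate characters B, -, and |. The class is unnecessary here.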

The simplified regex is: dataset[grep('^(B-|B\\^)', dataset$Col1),]
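To check the simplified regex on something concrete, here is a minimal sketch. The sample data frame is made up for illustration (only the column name Col1 comes from the question), and the character-class and startsWith() variants are alternatives I am assuming would suit the same task, not something the question asked for.

```r
# Hypothetical sample data; only the column name Col1 comes from the question.
dataset <- data.frame(
  Col1 = c("B-101", "B^202", "A-303", "AB-404", "B505"),
  value = 1:5,
  stringsAsFactors = FALSE
)

# Alternation with the caret escaped, as in the simplified regex.
by_alternation <- dataset[grep("^(B-|B\\^)", dataset$Col1), ]

# Character-class variant: inside [ ], a hyphen placed first and a caret
# placed anywhere but first are both treated as literal characters.
by_class <- dataset[grepl("^B[-^]", dataset$Col1), ]

# No regex at all: startsWith() compares literal prefixes (needs R >= 3.3.0).
by_prefix <- dataset[startsWith(dataset$Col1, "B-") |
                       startsWith(dataset$Col1, "B^"), ]

print(by_alternation)
```

All three approaches should keep only the "B-101" and "B^202" rows; "AB-404" is rejected by the leading anchor and "B505" has neither prefix. The startsWith() version avoids escaping questions entirely, at the cost of one call per prefix.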

1 Comment

I see your point about not needing the [ ] character class. Thank you for the insight!
