I have a bunch of strings which look like this:

 [3] "  3. Wiki: Los Angeles 3:58pm; score:1.959502"        
 [4] "  4. Wiki: Boston 6:58pm; score:1.959502"             
 [5] "  5. Disambiguation: 'Boon; score:1.934644"            
 [6] "  6. Wiki: The Note (album)\"; score:1.786931"          

I parse them into a data frame like this:

read.csv(text=sub("^  [0-9]*\\. (Wiki|Disambiguation): (.*); score:([0-9\\.]*)$","\"\\2\",\\3",ll),
         header=FALSE,stringsAsFactors=FALSE)

The trouble is that the \\2 text, which I enclose in double quotes, may itself contain quotes (both double and single).

How do I deal with this?

  • Does changing the double quotes to single quotes, i.e. ,"\"\\2\", to ,"\'\\2\', help? The sub still works: sub("^ [0-9]*\\. (Wiki|Disambiguation): (.*); score:([0-9\\.]*)$","\'\'\'''2''\',\\3", 'hello') Commented Feb 14, 2014 at 2:53

1 Answer

Just remove the double quotes:

ll <-  gsub('"', '', ll)

NOTE: Changed answer after poster gave an example of how it goes wrong.
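
If the quotes inside the titles need to be kept rather than stripped, one option is to skip the CSV round trip and pull the capture groups out directly. The following is only a sketch under assumptions: the sample `ll` is rebuilt from the strings shown in the question, and the column names `title` and `score` are made up for illustration (the original `read.csv` call would produce `V1` and `V2`).

```r
# Sample input modelled on the strings shown in the question (assumed)
ll <- c("  3. Wiki: Los Angeles 3:58pm; score:1.959502",
        "  4. Wiki: Boston 6:58pm; score:1.959502",
        "  5. Disambiguation: 'Boon; score:1.934644",
        "  6. Wiki: The Note (album)\"; score:1.786931")

# Same pattern as in the question
pattern <- "^  [0-9]*\\. (Wiki|Disambiguation): (.*); score:([0-9\\.]*)$"

# regexec() locates the full match and each capture group;
# regmatches() returns, per input string, a character vector of
# c(full match, group 1, group 2, group 3)
m <- regmatches(ll, regexec(pattern, ll))

# Build the data frame directly, so no CSV quoting rules are involved.
# Element 3 is capture group 2 (the title), element 4 is group 3 (the score).
df <- data.frame(title = vapply(m, `[`, character(1), 3),
                 score = as.numeric(vapply(m, `[`, character(1), 4)),
                 stringsAsFactors = FALSE)
```

Because the text never passes through `read.csv`, embedded single or double quotes (and commas) in the title are kept as they are. Strings that do not match the pattern yield `NA` in both columns.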

1 Comment

I added quotes to the places where they cause me grief
