
I need an elegant way in R to take a URL-encoded string such as: test.aspx?title=Pancreas%20Cancer&startYear=2000&dataCSV=461%2C520%2C559%2C627%2C1065%2C1217%2C1323%2C1512%2C2160%2C2554%2C2707%2C3000%2C3495%2C4163%2C4927%2C5237%2C4785%2C5559%2C6490%2C7559%2C5106%2C6358%2C6824%2C7980%2C4873%2C6715%2C7038%2C8156%2C4863%2C6460%2C7244%2C9161%2C5237%2C6176%2C6531%2C9068%2C5628%2C5983%2C5871%2C7951%2C6060%2C6089%2C5520%2C6627%2C5722%2C6099%2C5586%2C5822%2C4185%2C4909%2C5053%2C5273%2C5227%2C6077%2C6681%2C7977%2C

And get back a list that uses the keys sent in the URL as names and the values sent as elements:

list ( title = "Pancreas Cancer",
       startYear = 2000,
       dataCSV = matrix ( c(461, 520....etc), nrow = X, byrow = TRUE)
 )

Is there any way to do this easily in R?

  • Are you using httr? Commented Dec 10, 2013 at 21:26
  • utils::URLdecode() will get you started. Commented Dec 10, 2013 at 21:27
  • URLdecode was my first thought, but it only replaces URL-encoded sequences such as %2C or %20 with their readable values (a comma or a space). I need something that actually parses the string into key-value pairs. Commented Dec 11, 2013 at 11:06

2 Answers


httr provides a function for parsing URLs, parse_url(). Here is an example:

url <- "test.aspx?title=Pancreas%20Cancer&startYear=2000&dataCSV=461%2C520%2C559%2C627%2C1065%2C1217%2C1323%2C1512%2C2160%2C2554%2C2707%2C3000%2C3495%2C4163%2C4927%2C5237%2C4785%2C5559%2C6490%2C7559%2C5106%2C6358%2C6824%2C7980%2C4873%2C6715%2C7038%2C8156%2C4863%2C6460%2C7244%2C9161%2C5237%2C6176%2C6531%2C9068%2C5628%2C5983%2C5871%2C7951%2C6060%2C6089%2C5520%2C6627%2C5722%2C6099%2C5586%2C5822%2C4185%2C4909%2C5053%2C5273%2C5227%2C6077%2C6681%2C7977%2C"

library(httr)
str(parse_url(url)$query)

2 Comments

Ah, parse_url is what I was trying to remember. I should have, since I've used it before. Thanks @hadley
I'll give this a try in the code later today and post my results.

This is a start, though it is probably not very general:

# URLdecode() lives in utils, which is attached by default, so RCurl is not required
string <- "http://stuff.com/test.aspx?title=Pancreas%20Cancer&startYear=2000&dataCSV=461%2C520%2C559%2C627%2C1065%2C1217%2C1323%2C1512%2C2160%2C2554%2C2707%2C3000%2C3495%2C4163%2C4927%2C5237%2C4785%2C5559%2C6490%2C7559%2C5106%2C6358%2C6824%2C7980%2C4873%2C6715%2C7038%2C8156%2C4863%2C6460%2C7244%2C9161%2C5237%2C6176%2C6531%2C9068%2C5628%2C5983%2C5871%2C7951%2C6060%2C6089%2C5520%2C6627%2C5722%2C6099%2C5586%2C5822%2C4185%2C4909%2C5053%2C5273%2C5227%2C6077%2C6681%2C7977%2C"
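# Keep only the query part after the "?"
query <- strsplit(string, "?", fixed = TRUE)[[1]][[2]]
lapply(strsplit(query, "&", fixed = TRUE)[[1]], function(x) {
  # Split each pair into key and value
  tmp <- strsplit(x, "=", fixed = TRUE)[[1]]
  # Decode after splitting, so an encoded "&" or "=" inside a value is not treated as a separator
  val <- URLdecode(tmp[[2]])
  names(val) <- tmp[[1]]
  as.list(val)
})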

[[1]]
[[1]]$title
[1] "Pancreas Cancer"


[[2]]
[[2]]$startYear
[1] "2000"


[[3]]
[[3]]$dataCSV
[1] "461,520,559,627,1065,1217,1323,1512,2160,2554,2707,3000,3495,4163,4927,5237,4785,5559,6490,7559,5106,6358,6824,7980,4873,6715,7038,8156,4863,6460,7244,9161,5237,6176,6531,9068,5628,5983,5871,7951,6060,6089,5520,6627,5722,6099,5586,5822,4185,4909,5053,5273,5227,6077,6681,7977,"
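The result above is a list of one-element lists, whereas the question asks for a single list keyed by parameter name. A minimal base-R sketch of that flattening follows. The helper name `parse_query` is hypothetical, the sample URL is a shortened version of the one in the question, and the sketch assumes a well-formed query string; values are still returned as character strings.

```r
# Hypothetical helper: parse the query part of a URL into a flat named list.
parse_query <- function(url) {
  # Drop everything up to and including the first "?"
  query <- sub("^[^?]*\\?", "", url)
  pairs <- strsplit(query, "&", fixed = TRUE)[[1]]
  # Split each pair at its first "=" only
  keys <- sub("=.*$", "", pairs)
  vals <- sub("^[^=]*=?", "", pairs)
  # Decode after splitting so an encoded "&" or "=" stays inside its value
  out <- lapply(vals, utils::URLdecode)
  names(out) <- vapply(keys, utils::URLdecode, character(1), USE.NAMES = FALSE)
  out
}

# Shortened sample URL (assumption: same shape as the one in the question)
res <- parse_query("test.aspx?title=Pancreas%20Cancer&startYear=2000&dataCSV=461%2C520%2C")
str(res)
```

Each element can then be reached by name, for example `res$title` or `res$dataCSV`.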

2 Comments

Thanks! I was hoping R had some package I was unaware of that made this more concise, but this works great!
The last step for readers is to take dataCSV and convert it into a numeric vector, like this: as.numeric(strsplit(outputFromLapply[[3]][[1]], ",")[[1]]). You then convert that to a matrix by passing the vector to matrix() and setting ncol, nrow, and byrow = TRUE.
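The conversion described in the comment above can be sketched as follows. The string is only the first eight values from the question, and `ncol = 4` is an assumption, since the question leaves the matrix shape unspecified (X).

```r
# First eight values of dataCSV from the question, including the trailing comma
csv <- "461,520,559,627,1065,1217,1323,1512,"

# strsplit() drops the empty piece after the trailing comma
values <- as.numeric(strsplit(csv, ",", fixed = TRUE)[[1]])

# Assumed shape: 4 columns, filled row by row
m <- matrix(values, ncol = 4, byrow = TRUE)
print(m)
```

If the number of values is not a multiple of the chosen column count, matrix() recycles values and warns, so it may be worth checking length(values) first.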
