1

I want to convert a json-file into a dataframe in R. With the following code:

link <- 'https://www.dropbox.com/s/ckfn1fpkcix1ccu/bevingenbag.json'
document <- fromJSON(file = link, method = 'C')
bev <- do.call("cbind", document)

i'm getting this:

    type                features
1   FeatureCollection   list(type = "Feature", geometry = list(type = "Point", coordinates = c(6.54800000288927, 52.9920000044505)), properties = list(gid = "1496600", yymmdd = "19861226", lat = "52.992", lon = "6.548", mag = "2.8", depth = "1.0", knmilocatie = "Assen", baglocatie = "Assen", tijd = "74751"))

which is the first row of a matrix. All the other rows have the same structure. I'm interested in the properties = list(gid = "1496600", yymmdd = "19861226", lat = "52.992", lon = "6.548", mag = "2.8", depth = "1.0", knmilocatie = "Assen", baglocatie = "Assen", tijd = "74751") part, which should be converted into a dataframe with the columns gid, yymmdd, lat, lon, mag, depth, knmilocatie, baglocatie, tijd.

I searched for and tryed several solutions but none of them worked. I used the rjson package for this. I also tryed the RJSONIO & jsonlite package, but was unable to extract the desired information.

Anyone an idea how to solve this problem?

5
  • 1
    Please provide a reproducible example. Commented Jan 14, 2014 at 19:04
  • Which package do you use? RJSONIOor rjson? Commented Jan 14, 2014 at 19:18
  • 1
    Does this answer your question? Commented Jan 14, 2014 at 19:22
  • @jlhoward no, I tryed that solution Commented Jan 14, 2014 at 19:39
  • @SvenHohenstein I edited the question and included a link to the json file. Commented Jan 14, 2014 at 19:40

2 Answers 2

4

Here's a way to obtain the data frame:

library(rjson)
document <- fromJSON(file = "bevingenbag.json", method = 'C')

dat <- do.call(rbind, lapply(document$features, 
                             function(x) data.frame(x$properties)))

Edit: How to replace empty values with NA:

dat$baglocatie[dat$baglocatie == ""] <- NA

The result:

head(dat)

      gid   yymmdd    lat   lon mag depth knmilocatie baglocatie   tijd
1 1496600 19861226 52.992 6.548 2.8   1.0       Assen      Assen  74751
2 1496601 19871214 52.928 6.552 2.5   1.5   Hooghalen  Hooghalen 204951
3 1496602 19891201 52.529 4.971 2.7   1.2   Purmerend    Kwadijk 200914
4 1496603 19910215 52.771 6.914 2.2   3.0       Emmen      Emmen  21116
5 1496604 19910425 52.952 6.575 2.6   3.0   Geelbroek    Ekehaar 102631
6 1496605 19910808 52.965 6.573 2.7   3.0     Eleveld      Assen  40114
Sign up to request clarification or add additional context in comments.

1 Comment

Thanx, it works :-) One more question: some of the values in the column baglocatie are empty. How do I replace them with NA's or the value from knmilocatie?
4

This is just another, quite similar, approach.

@SvenHohenstein's approach creates a dataframe at each step, an expensive process. It's much faster to create vectors and re-type the whole result at the end. Also, Sven's approach makes each column a factor, which might or might not be what you want. The approach below runs about 200 times faster. This can be important if you intend to do this repeatedly. Finally, you will need to convert columns lon, lat, mag, and depth to numeric.

library(microbenchmark)
library(rjson)

document <- fromJSON(file = "bevingenbag.json", method = 'C')

json2df.1 <- function(json){   # @SvenHohenstein approach
  df <- do.call(rbind, lapply(json$features, 
                       function(x) data.frame(x$properties, stringsAsFactors=F)))
  return(df)
}

json2df.2 <- function(json){
  df <- do.call(rbind,lapply(json[["features"]],function(x){c(x$properties)}))
  df <- data.frame(apply(result,2,as.character), stringsAsFactors=F)
  return(df)
}

microbenchmark(x<-json2df.1(document), y<-json2df.2(document), times=10)
# Unit: milliseconds
#                     expr        min         lq     median         uq        max neval
#  x <- json2df.1(document) 2304.34378 2654.95927 2822.73224 2977.75666 3227.30996    10
#  y <- json2df.2(document)   13.44385   15.27091   16.78201   18.53474   19.70797    10
identical(x,y)
# [1] TRUE

1 Comment

why there is no function similar to XMLtoDataFrame for JSON (just like when XML is parsed)?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.