
How do I get from here ...

| ID | JSON Request                                                          |
==============================================================================
|  1 | {"user":"xyz1","weightmap": {"P1":0,"P2":100}, "domains":["a1","b1"]} |
------------------------------------------------------------------------------
|  2 | {"user":"xyz2","weightmap": {"P1":100,"P2":0}, "domains":["a2","b2"]} |
------------------------------------------------------------------------------

to here (the requirement is to expand the JSON in column 2 into separate table columns):

| User | P1  | P2  | domains |
==============================
| xyz1 |   0 | 100 | a1, b1  |
------------------------------
| xyz2 | 100 |   0 | a2, b2  |
------------------------------

Here is the code to generate the data.frame:

raw_df <- 
  data.frame(
    id   = 1:2,
    json = 
      c(
        '{"user": "xyz2", "weightmap": {"P1":100,"P2":0}, "domains": ["a2","b2"]}', 
        '{"user": "xyz1", "weightmap": {"P1":0,"P2":100}, "domains": ["a1","b1"]}'
      ), 
    stringsAsFactors = FALSE
  )
    Check out the jsonlite package. It reads JSON into a nested list, which you can then easily recast as a data.frame. Commented Feb 1, 2017 at 20:19
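As a minimal sketch of what that comment describes, here is jsonlite::fromJSON applied to a single string from the question. The variable names txt and parsed are only illustrative.

```r
library(jsonlite)

# One JSON string taken from the question's data
txt <- '{"user":"xyz1","weightmap": {"P1":0,"P2":100}, "domains":["a1","b1"]}'

# fromJSON() returns a named list: the nested object becomes a nested list
# and the array of strings becomes a character vector
parsed <- fromJSON(txt)
str(parsed)

# Individual fields can then be picked out by name
parsed$user          # the user name
parsed$weightmap$P2  # a value inside the nested object
parsed$domains       # character vector with both domains
```

From this list, each field can be copied into a data.frame column by hand, or the whole list can be passed to as.data.frame as the answers below do.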

5 Answers


Here's a tidyverse solution (also using jsonlite) if you're happy to work in a long format (for domains in this case):

library(jsonlite)
library(dplyr)
library(purrr)
library(tidyr)

d <- data.frame(
  id = c(1, 2),
  json = c(
    '{"user":"xyz1","weightmap": {"P1":0,"P2":100}, "domains":["a1","b1"]}',
    '{"user":"xyz2","weightmap": {"P1":100,"P2":0}, "domains":["a2","b2"]}'
  ),
  stringsAsFactors = FALSE
)

d %>% 
  mutate(json = map(json, ~ fromJSON(.) %>% as.data.frame())) %>% 
  unnest(json)
#>   id user weightmap.P1 weightmap.P2 domains
#> 1  1 xyz1            0          100      a1
#> 2  1 xyz1            0          100      b1
#> 3  2 xyz2          100            0      a2
#> 4  2 xyz2          100            0      b2
  • mutate(...) converts each JSON string into a data frame, so json becomes a list column of nested data frames.
  • unnest(...) expands those nested data frames into regular columns, giving one row per domain.
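If the wide layout from the question is wanted instead (one row per id, domains joined into one string), a possible sketch using only jsonlite and base R follows. The helper parse_one and the result name wide are hypothetical names introduced here, and the sketch assumes every record has exactly the fields user, weightmap$P1, weightmap$P2 and domains.

```r
library(jsonlite)

d <- data.frame(
  id = c(1, 2),
  json = c(
    '{"user":"xyz1","weightmap": {"P1":0,"P2":100}, "domains":["a1","b1"]}',
    '{"user":"xyz2","weightmap": {"P1":100,"P2":0}, "domains":["a2","b2"]}'
  ),
  stringsAsFactors = FALSE
)

# parse_one() is a hypothetical helper: one JSON string in, one-row data frame out
parse_one <- function(txt) {
  x <- fromJSON(txt)
  data.frame(
    user    = x$user,
    P1      = x$weightmap$P1,
    P2      = x$weightmap$P2,
    # collapse the domains array into a single comma-separated string
    domains = paste(x$domains, collapse = ", "),
    stringsAsFactors = FALSE
  )
}

# Parse every row, stack the one-row data frames, and put id back in front
wide <- data.frame(
  id = d$id,
  do.call(rbind, lapply(d$json, parse_one)),
  stringsAsFactors = FALSE
)
wide
```

Collapsing domains into a string keeps one row per request, at the cost of having to split the string again if individual domains are needed later.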

1 Comment

Instead of as.data.frame, one should now use as_tibble.

I would use the jsonlite package together with mapply, a small transformation function, and data.table's rbindlist.

# data 
raw_df <- data.frame(id = 1:2, json = c('{"user": "xyz2", "weightmap": {"P1":100,"P2":0}, "domains": ["a2","b2"]}', '{"user": "xyz1", "weightmap": {"P1":0,"P2":100}, "domains": ["a1","b1"]}'), stringsAsFactors = FALSE)

# libraries
library(jsonlite)
library(data.table)


# 1) First, make a transformation function that works for a single entry
f <- function(json, id){
  # transform json to list
  tmp    <- jsonlite::fromJSON(json)

  # transform list to data.frame
  tmp    <- as.data.frame(tmp)

  # add id
  tmp$id <- id

  # return
  return(tmp)
}


# 2) apply it via mapply 
json_dfs <- 
  mapply(f, raw_df$json, raw_df$id, SIMPLIFY = FALSE)


# 3) combine the fragments via rbindlist
clean_df <- 
  data.table::rbindlist(json_dfs)

# 4) et voilà
clean_df
##    user weightmap.P1 weightmap.P2 domains id
## 1: xyz2          100            0      a2  1
## 2: xyz2          100            0      b2  1
## 3: xyz1            0          100      a1  2
## 4: xyz1            0          100      b1  2


I could not get the flatten parameter to work as I expected, so I needed to unlist and then "re-list" each record before rbinding with do.call:

library(jsonlite)
do.call(rbind,
        lapply(raw_df$json,
               function(j) as.list(unlist(fromJSON(j, flatten = TRUE)))))
     user   weightmap.P1 weightmap.P2 domains1 domains2
[1,] "xyz2" "100"        "0"          "a2"     "b2"    
[2,] "xyz1" "0"          "100"        "a1"     "b1"    

Admittedly, this will require further processing, since unlist coerces all the values to character.
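One way to do that further processing is sketched below. It assumes every record has the same fields in the same order, builds a plain character matrix (dropping the as.list step used above), and then lets base R's type.convert guess the column types. The name typed_df is only illustrative.

```r
library(jsonlite)

json <- c(
  '{"user": "xyz2", "weightmap": {"P1":100,"P2":0}, "domains": ["a2","b2"]}',
  '{"user": "xyz1", "weightmap": {"P1":0,"P2":100}, "domains": ["a1","b1"]}'
)

# Flatten each record to a named character vector and stack them as rows
m <- do.call(rbind, lapply(json, function(j) unlist(fromJSON(j))))

# Turn the character matrix into a data frame of character columns
typed_df <- as.data.frame(m, stringsAsFactors = FALSE)

# Convert each column to the most specific type that fits its values;
# as.is = TRUE keeps text columns as character instead of factor
typed_df[] <- lapply(typed_df, type.convert, as.is = TRUE)

str(typed_df)
```

After the conversion, the weight columns are integers again while user and the domain columns stay character.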

1 Comment

+1 for using as much base R as possible, although data.table::rbindlist will be considerably faster than do.call(rbind, ...)
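A jsonlite approach that flattens each record with unlist and then merges the two domain columns with tidyr::unite:

library(jsonlite)
library(tidyr)  # provides unite() and re-exports %>%

json <- c(
  '{"user":"xyz1","weightmap": {"P1":0,"P2":100}, "domains":["a1","b1"]}',
  '{"user":"xyz2","weightmap": {"P1":100,"P2":0}, "domains":["a2","b2"]}'
)

# wrap each string in [ ] so that fromJSON() returns a one-row data frame
json <- lapply(paste0("[", json, "]"),
               function(x) jsonlite::fromJSON(x))

# unlist() flattens every record to 5 values (and coerces them to character)
df <- data.frame(matrix(unlist(json), nrow = 2, ncol = 5, byrow = TRUE))

# merge the two domain columns into one
df <- df %>% unite(Domains, X4, X5, sep = ", ")
colnames(df) <- c("user", "P1", "P2", "domains")
head(df)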

The output is:

  user  P1  P2 domains
1 xyz1   0 100  a1, b1
2 xyz2 100   0  a2, b2


Using tidyjson (https://cran.r-project.org/web/packages/tidyjson/vignettes/introduction-to-tidyjson.html):

install.packages("tidyjson")

library(tidyjson)

# spread the scalar values of each JSON object into columns
json_as_df <- raw_df$json %>% spread_all

# same, but retain the other columns of raw_df (here: id)
json_as_df <- raw_df %>% as.tbl_json(json.column = "json") %>% spread_all
