I have a column that contains JSON data, as in the following example:

library(data.table)
test <- data.table(a = list(1,2,3), 
           info = list("{'duration': '10', 'country': 'US'}", 
                       "{'duration': '20', 'country': 'US'}",
                       "{'duration': '30', 'country': 'GB', 'width': '20'}"))

I want to convert the last column to the equivalent R storage, which would look similar to this:

res <- data.table(a = list(1, 2, 3),
                  duration = list(10, 20, 30),
                  country = list('US', 'US', 'GB'),
                  width = list(NA, NA, 20))

I have 500K rows with varying contents, so I am looking for a fast way to do this.

    Ok, feel free to edit if you know how to correct it in some way that doesn't break the answer. Commented Oct 24, 2016 at 19:21

2 Answers

A variation that works on the column directly, without first combining the rows into a single JSON string:

library(data.table)
library(jsonlite)

test[, info := gsub("'", "\"", info)]
test[, rbindlist(lapply(info, fromJSON), use.names = TRUE, fill = TRUE)]

#    duration country width
# 1:       10      US    NA
# 2:       20      US    NA
# 3:       30      GB    20
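
The result above drops column a and leaves every parsed value as a character string, whereas the question asks for a to be kept and for numeric durations and widths. A minimal sketch of one way to get closer to that, assuming duration and width are the only numeric keys (the names json, parsed, num_cols and res are just illustrative):

```r
library(data.table)
library(jsonlite)

test <- data.table(a = list(1, 2, 3),
                   info = list("{'duration': '10', 'country': 'US'}",
                               "{'duration': '20', 'country': 'US'}",
                               "{'duration': '30', 'country': 'GB', 'width': '20'}"))

# Replace single quotes with double quotes so each string is valid JSON.
# Assumption: no value contains an apostrophe of its own.
json <- gsub("'", "\"", unlist(test$info))

# Parse each row, then stack the results; keys missing from a row become NA
parsed <- rbindlist(lapply(json, fromJSON), use.names = TRUE, fill = TRUE)

# Convert the columns assumed to hold numbers from character to numeric
num_cols <- c("duration", "width")
parsed[, (num_cols) := lapply(.SD, as.numeric), .SDcols = num_cols]

# Put the original column back next to the parsed ones
res <- cbind(test[, .(a)], parsed)
```

Unlike the target in the question, the parsed columns here are plain atomic vectors rather than list columns, which is usually easier to work with.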

1 Comment

nice data table solution! Exactly what I was looking for.

Parse the JSON first, then build the data.frame (or data.table):

json_string <- paste(c("[{'duration': '10', 'country': 'US'}", 
    "{'duration': '20', 'country': 'US'}",
  "{'duration': '30', 'country': 'GB'}",
  "{'width': '20'}]"), collapse=", ")

# JSON standard requires double quotes
json_string <- gsub("'", "\"", json_string)

library("jsonlite")
fromJSON(json_string)

#  duration country width
# 1       10      US  <NA>
# 2       20      US  <NA>
# 3       30      GB  <NA>
# 4     <NA>    <NA>    20

This isn't exactly what you asked for, because this JSON doesn't associate 'width' with the previous record. You might need to do some manipulation first:

json_string <- paste(c("[{'duration': '10', 'country': 'US'}", 
    "{'duration': '20', 'country': 'US'}",
  "{'duration': '30', 'country': 'GB', 'width': '20'}]"), 
  collapse=", ")

json_string <- gsub("'", "\"", json_string)
df <- jsonlite::fromJSON(json_string)
data.table::as.data.table(df)

#    duration country width
# 1:       10      US    NA
# 2:       20      US    NA
# 3:       30      GB    20
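
The JSON array above is typed in by hand. With the question's data it could instead be assembled from the info column, so that fromJSON is called only once for all rows. A rough sketch under the same quoting assumption (no apostrophes inside the values); json_array is an illustrative name:

```r
library(data.table)
library(jsonlite)

test <- data.table(a = list(1, 2, 3),
                   info = list("{'duration': '10', 'country': 'US'}",
                               "{'duration': '20', 'country': 'US'}",
                               "{'duration': '30', 'country': 'GB', 'width': '20'}"))

# Join the per-row JSON objects into a single JSON array string
json_array <- paste0("[", paste(unlist(test$info), collapse = ", "), "]")

# The JSON standard requires double quotes
json_array <- gsub("'", "\"", json_array)

# One call parses every row; keys missing from a record come back as NA
df <- fromJSON(json_array)
```

Whether this is faster than parsing row by row on 500K rows is worth measuring on the real data rather than assuming.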

1 Comment

Consider using setDT(df) in place of data.table::as.data.table(df).
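
To illustrate the suggestion: setDT converts a data.frame to a data.table by reference, so no copy of the data is made, while as.data.table returns a new object. A small sketch with a hand-built data.frame standing in for the fromJSON result:

```r
library(data.table)

# Stand-in for the data.frame returned by jsonlite::fromJSON
df <- data.frame(duration = c("10", "20", "30"),
                 country  = c("US", "US", "GB"),
                 width    = c(NA, NA, "20"),
                 stringsAsFactors = FALSE)

# Convert in place; df itself becomes a data.table, nothing is reassigned
setDT(df)

is.data.table(df)
```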