0

I am trying to convert a JSON response file using the JSONlite package, but am running into a few issues in the way the data is being presented in a data frame.

Using this code:


library(dplyr)
library(jsonlite)


json_data <- fromJSON("File Path")
df <-data.frame(unlist(json_data))

JSON Response looks like:

{
    "totalPages": 1,
    "firstPage": true,
    "lastPage": true,
    "numberOfElements": 1757,
    "number": 0,
    "totalElements": 1757,
    "columns": {
        "dimension": {
            "id": "variables/evar4",
            "type": "string"
        },
        "columnIds": [
            "1",
            "2",
            "3",
            "4"
        ]
    },
    "rows": [
        {
            "itemId": "0",
            "value": "Unspecified",
            "data": [
                220,
                1.0002442201269182,
                0.29499089253187616,
                2.890909090909091
            ]
        },
        {
            "itemId": "118293442",
            "value": "PNIwTjWWjphkqfu",
            "data": [
                5,
                1.0002442201269182,
                57.879999999999995,
                30.8
            ]
        },
        {
            "itemId": "1810135314",
            "value": "PNIFBOIKLplumdb",
            "data": [
                3,
                1.0002442201269182,
                1243.0277777777778,
                545.3333333333334
            ]
        }
    ]
}

I receive a table df that has columns and rows all in one list. Is there a way to have the column ids going across the top, with the corresponding rowIDs matched to the proper column?

So that the data frame looks like:


                    1     2    3     4  
Unspecified        220  1.00 0.294   2.89
PNIwTjWWjphkqfu     5   1.00 57.87   30.8
PNIFBOIKLplumdb     3   1.00 1243.0  545.33

Any help on this would be greatly appreciated

1
  • Please check your post; your JSON file is not complete and as such is not in a valid JSON format. Commented Apr 17, 2019 at 4:45

1 Answer 1

1

Please note that your JSON file as posted is incomplete and missing closing brackets.

One option to reproduce your expected output is to unnest data in the rows element of the fromJSON output.

library(jsonlite)
json_data <- fromJSON("sample.json")

library(tidyverse)
json_data$rows %>%
    select(-itemId) %>%
    unnest() %>%
    group_by(value) %>%
    mutate(n = 1:n()) %>%
    spread(n, data) %>%
    ungroup()

## A tibble: 3 x 5
#  value             `1`   `2`      `3`    `4`
#  <chr>           <dbl> <dbl>    <dbl>  <dbl>
#1 PNIFBOIKLplumdb     3  1.00 1243.    545.
#2 PNIwTjWWjphkqfu     5  1.00   57.9    30.8
#3 Unspecified       220  1.00    0.295   2.89

Explanation: I recommend going through the code line-by-line to understand what each command does. In short: json_data$rows is a data.frame; we select all columns except itemId, and unnest the data in list column data. This results in your data being in long format, so in order to reproduce your expected we need to reshape the data from long to wide. To do so, we group entries by value, add a counter and use spread to give the final output.


Content of file "sample.json"

{
    "totalPages": 1,
    "firstPage": true,
    "lastPage": true,
    "numberOfElements": 1757,
    "number": 0,
    "totalElements": 1757,
    "columns": {
        "dimension": {
            "id": "variables/evar4",
            "type": "string"
        },
        "columnIds": [
            "1",
            "2",
            "3",
            "4"
        ]
    },
    "rows": [
        {
            "itemId": "0",
            "value": "Unspecified",
            "data": [
                220,
                1.0002442201269182,
                0.29499089253187616,
                2.890909090909091
            ]
        },
        {
            "itemId": "118293442",
            "value": "PNIwTjWWjphkqfu",
            "data": [
                5,
                1.0002442201269182,
                57.879999999999995,
                30.8
            ]
        },
        {
            "itemId": "1810135314",
            "value": "PNIFBOIKLplumdb",
            "data": [
                3,
                1.0002442201269182,
                1243.0277777777778,
                545.3333333333334
            ]
        }
    ]
}
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you for this response, it worked perfectly. Could you please walk me through how this code works?
Done @AlokMadan ;-)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.