What's the correct way to traverse and modify a JSON string in C?

Specifically, I have a string, body_buf. When printed out

printf("length: %d\n%.*s\n", body_len, body_len, body_buf);

It looks like this:

length: 113
{"field1":"something","whatever":10,"description":"body","id":"random","__oh__":{"session":"12345678jhgfdrtyui"}}

Another more complicated body_buf may look like this:

{"status":1,"query":{},"proc":{"memory":{"total":17177939968,"cmax":18363625472,"amax":20000000000},"cpu":{"cores":[0.788,0.132,0.319,2.951,10.111,3.309,1.43,0.8,2.705,4.203,2.32,2,0.019,0.172,0.247,3.888,0.282,0.423,5.254,0.258,0.009,0.369,3.277,0.048,0.283,7.574,3.086,1.592,0.191,0.166,4.348,0.391,0.085,0.25,7.12,4.927,3.671,1.147,3.216,4.628,0.131,0.995,0.744,4.252,4.022,3.505,3.758,3.491],"total":108.886,"limit":800},"disk":{"used":20170,"limit":50000,"io_limit":500}}}

I want to simplify body_buf (which also doubles as removing sensitive information) according to the following rules, only modifying the values, not any of the keys:

  1. Strings become the length of strings.
  2. Array of strings becomes [array_len, max_len, min_len].
  3. Array of numbers becomes [array_len, max, min].
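
Rules 2 and 3 boil down to the same reduction, once over string lengths and once over numeric values. A minimal sketch in plain C, independent of any JSON library; the function names summarize_numbers and summarize_strings are hypothetical, and rejecting empty arrays is an assumption, since the rules do not say what an empty array should become:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for rule 3: writes {array_len, max, min} into out[3].
 * Returns 0 on success, -1 for an empty array (not covered by the rules). */
static int summarize_numbers(const double *v, size_t n, double out[3])
{
    assert(v != NULL && out != NULL);
    if (n == 0) return -1;
    double max = v[0], min = v[0];
    for (size_t i = 1; i < n; i++) {
        if (v[i] > max) max = v[i];
        if (v[i] < min) min = v[i];
    }
    out[0] = (double)n;
    out[1] = max;
    out[2] = min;
    return 0;
}

/* Hypothetical helper for rule 2: writes {array_len, max_len, min_len}
 * into out[3], using strlen() on each NUL-terminated element. */
static int summarize_strings(const char *const *v, size_t n, size_t out[3])
{
    assert(v != NULL && out != NULL);
    if (n == 0) return -1;
    size_t max = strlen(v[0]), min = max;
    for (size_t i = 1; i < n; i++) {
        size_t len = strlen(v[i]);
        if (len > max) max = len;
        if (len < min) min = len;
    }
    out[0] = n;
    out[1] = max;
    out[2] = min;
    return 0;
}
```

For the array [1,2,3,4,5], summarize_numbers fills out with 5, 5, 1, which matches the [array_len, max, min] ordering of rule 3.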

I'm not familiar with working with JSON strings in C. What's the best way to do this?

I could treat body_buf as a plain string and walk through it, modifying whatever comes after a ":", since that is where the values I might change are, depending on their type. For arrays, I would need to keep track of everything sandwiched between "[" and "]". This could work but doesn't seem very straightforward.
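
To make the string-scanning idea concrete, here is a rough sketch that covers rule 1 only. The function name strings_to_lengths is hypothetical. It assumes the input is a NUL-terminated copy of body_buf, counts the raw bytes between the quotes (so an escape such as \n counts as two), and treats strings inside arrays like any other value, so rules 2 and 3 are not handled. A quoted string is taken to be a key when the next non-space character after it is a ":".

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy JSON text from `in` to `out`, replacing every
 * string value with its raw byte length and leaving keys untouched.
 * Returns 0 on success, -1 if `out` is too small or a string never closes. */
static int strings_to_lengths(const char *in, char *out, size_t outsz)
{
    assert(in != NULL && out != NULL);
    size_t o = 0;
    while (*in != '\0') {
        if (*in != '"') {                 /* outside a string: copy as-is */
            if (o + 1 >= outsz) return -1;
            out[o++] = *in++;
            continue;
        }
        const char *start = in + 1;       /* first byte after the opening quote */
        const char *end = start;
        while (*end != '\0' && *end != '"') {
            if (*end == '\\' && end[1] != '\0') end++;  /* skip escaped byte */
            end++;
        }
        if (*end != '"') return -1;       /* unterminated string */
        const char *after = end + 1;
        while (isspace((unsigned char)*after)) after++;
        if (*after == ':') {
            /* a key: copy it verbatim, quotes included */
            size_t n = (size_t)(end + 1 - in);
            if (o + n >= outsz) return -1;
            memcpy(out + o, in, n);
            o += n;
        } else {
            /* a value: emit its length instead of its contents */
            int w = snprintf(out + o, outsz - o, "%zu", (size_t)(end - start));
            if (w < 0 || (size_t)w >= outsz - o) return -1;
            o += (size_t)w;
        }
        in = end + 1;                     /* continue after the closing quote */
    }
    if (o >= outsz) return -1;
    out[o] = '\0';
    return 0;
}
```

Even this reduced version has to track quoting, escapes, and the key/value distinction by hand, which is the main argument for parsing into a tree instead.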

Alternatively, I could parse body_buf into a JSON object and then traverse the nested structure, but then I also have to modify it. I have yet to find a C example, using json-c or otherwise, that traverses and modifies a JSON object (or builds a new one via some kind of deep copy). Such an example would be helpful.

Details (rules above, 1-3) aside, this should be a relatively common operation -- to traverse and modify. So for those more attuned to the intricacies and good/standard practices of json-c or JSON manipulation in general in C, I'm looking for some pointers.

I have both cJSON and json-c available:

#include "cJSON.h"
#include "cJSON_Utils.h"
#include <libjson/json.h>
#include <libjson/json_tokener.h>

Relevant information I've looked at so far includes the following:

https://gist.github.com/alan-mushi/19546a0e2c6bd4e059fd

How to get json values after json_tokener_parse()?

Parsing deeply nested JSON key using json-c

  • Sorry, I don't quite understand what you mean. I guess you want to simplify "cores":{...}? Do you want to change it to something like "cores":["array_len":N, "max_len":3, "min_len":1]? Commented Jan 27, 2021 at 0:51
  • Simplify everything, all the values. So string {"Name":"Tom", "Age":18, "Address": "California", "arr": [1,2,3,4,5]} becomes {"Name": 3,"Age":18,"Address":10,"arr":[5, 5, 1]}, according to the rules. @yanzhang.guo thanks for the query. Commented Jan 27, 2021 at 0:58
  • @yanzhang.guo Does that make sense? Commented Jan 27, 2021 at 1:10
  • Yes. So you want to parse the old string and rearrange it into {"Name": 3,"Age":18,"Address":10,"arr":[5, 5, 1]}? Is this what you want? Commented Jan 27, 2021 at 1:22
  • Clarifying your intentions is not the problem. What is needed is focusing the question so that it affords good answers that are neither "here's your code" nor "here's a complete tutorial on JSON-C". JSON-C has fairly complete documentation, with links to tutorials. Its API is a typical one: it parses JSON data into an object representation with a tree-like structure, and that representation affords traversing the tree, modifying it, and converting the result back to a string. Do some research. Commented Jan 27, 2021 at 23:06

1 Answer

I don't know how useful "simplifying" the JSON will be, but working with JSON in C can be scary the first time.

I like the cJSON library: it is light, portable, and stable. It has good test coverage, and the license is MIT.

I think this code, which uses cJSON, does what you asked:

#include <cjson/cJSON.h>
#include <stdbool.h>
#include <string.h>
#include <stdio.h>
#include <limits.h>
#include <float.h>

const char json1[] = "{\"field1\":\"something\",\"whatever\":10,\"description\":\"body\",\"id\":\"random\",\"__oh__\":{\"session\":\"12345678jhgfdrtyui\"}}";
const char json2[] = "{\"status\":1,\"query\":{},\"proc\":{\"memory\":{\"total\":17177939968,\"cmax\":18363625472,\"amax\":20000000000},\"cpu\":{\"cores\":[0.788,0.132,0.319,2.951,10.111,3.309,1.43,0.8,2.705,4.203,2.32,2,0.019,0.172,0.247,3.888,0.282,0.423,5.254,0.258,0.009,0.369,3.277,0.048,0.283,7.574,3.086,1.592,0.191,0.166,4.348,0.391,0.085,0.25,7.12,4.927,3.671,1.147,3.216,4.628,0.131,0.995,0.744,4.252,4.022,3.505,3.758,3.491],\"total\":108.886,\"limit\":800},\"disk\":{\"used\":20170,\"limit\":50000,\"io_limit\":500}}}";
const char json3[] = "{\"Name\":\"Tom\",\"Age\":18,\"Address\":\"California\",\"arr\":[1,2,3,4,5]}";

static void simplifyArray(cJSON *input, cJSON *output)
{  
    cJSON *item;
    size_t noElems = 0;
    
    if (cJSON_IsString(cJSON_GetArrayItem(input, 0))) {
        size_t max, min;
        max = 0;
        min = (size_t)-1; /* largest size_t value; UINT_MAX can be narrower than size_t */
        cJSON_ArrayForEach(item, input) {
            noElems++;
            size_t len = strlen(cJSON_GetStringValue(item));
            if (len > max) max = len;
            if (len < min) min = len;
        }
        cJSON *newArray = cJSON_AddArrayToObject(output, input->string);
        cJSON_AddItemToArray(newArray, cJSON_CreateNumber(noElems));
        cJSON_AddItemToArray(newArray, cJSON_CreateNumber(max));
        cJSON_AddItemToArray(newArray, cJSON_CreateNumber(min));

    } else if (cJSON_IsNumber(cJSON_GetArrayItem(input, 0))) {
        double max, min;
        max = -DBL_MAX;
        min = DBL_MAX;
        cJSON_ArrayForEach(item, input) {
            noElems++;
            double value = item->valuedouble;
            if (value > max) max = value;
            if (value < min) min = value;
        }
        cJSON *newArray = cJSON_AddArrayToObject(output, input->string);
        cJSON_AddItemToArray(newArray, cJSON_CreateNumber(noElems));
        cJSON_AddItemToArray(newArray, cJSON_CreateNumber(max));
        cJSON_AddItemToArray(newArray, cJSON_CreateNumber(min));
    }
}

static void simplify(cJSON *input, cJSON *output)
{
    cJSON *elem;
    for (elem = input; elem != NULL; elem = elem->next) {
        if (cJSON_IsString(elem)) {
            cJSON_AddNumberToObject(output, elem->string, strlen(cJSON_GetStringValue(elem)));
        } else if (cJSON_IsArray(elem)) {
            simplifyArray(elem, output);
        } else if (cJSON_IsObject(elem)) {
            cJSON *newOutput = cJSON_AddObjectToObject(output, elem->string);
            simplify(elem->child, newOutput);
        } else {
            cJSON *dup = cJSON_Duplicate(elem, true);
            cJSON_AddItemToObject(output, elem->string, dup);
        }
    }
}

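static void simplifyAndPrint(const char *json)
{
    cJSON *input = cJSON_Parse(json);
    if (input == NULL) {
        /* cJSON_Parse returns NULL on malformed input */
        fprintf(stderr, "invalid JSON\n");
        return;
    }
    cJSON *output = cJSON_CreateObject();
    simplify(input->child, output);
    /* the printed string is heap-allocated and must be released */
    char *text = cJSON_PrintUnformatted(output);
    if (text != NULL) {
        printf("%s\n", text);
        cJSON_free(text);
    }
    cJSON_Delete(input);
    cJSON_Delete(output);
}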

int main(void)
{
    simplifyAndPrint(json1);
    simplifyAndPrint(json2);
    simplifyAndPrint(json3);
    return 0;
}

The output:

{"field1":9,"whatever":10,"description":4,"id":6,"__oh__":{"session":18}}
{"status":1,"query":{},"proc":{"memory":{"total":17177939968,"cmax":18363625472,"amax":20000000000},"cpu":{"cores":[48,10.111,0.009],"total":108.886,"limit":800},"disk":{"used":20170,"limit":50000,"io_limit":500}}}
{"Name":3,"Age":18,"Address":10,"arr":[5,5,1]}

In the example above I chose not to alter the input JSON. If you don't care about that, you can use the function cJSON_ReplaceItemInObject to replace each node in place.

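P.S.: I am assuming that each array contains only strings or only numbers, never a mix, because there is no rule for other array configurations. As written, an array that is empty, or whose first element is neither a string nor a number, is left out of the output.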

P.S. 2: This code uses the version of the library that ships with Ubuntu 20.04. If you download the library from GitHub, that version will have more features.


3 Comments

This is perfect -- a clean and understandable example. With the lack of good existing examples on the subject anywhere (none that really helped me), this will be beneficial to others looking for an intro (crash course, really, because this captures pretty much the gist of everything one needs, the rhythm of it) to working with JSON in C (quite intimidating when not sure just what to use when and what's the proper/conventional way of doing things, especially if they are more used to a higher-level language). Yes, in my specific example, arrays are homogeneous and of either strings or numbers.
The background here is I'm collecting data for API discovery, and it is only the schema and some basic information that I need. This also roots out (at least reduces) potentially sensitive information in the data.
stackoverflow.com/questions/65097945/… Hi Matheus, does this sound like something you'd know the answer of as well? Cool if not.
