
I am trying to declare all nested properties as strings in a JSON schema, regardless of their names.

An example of the data looks like this (except with many more columns):

[
    {
        "var_1": "some_string",
        "var_2": {
                "col_1": "x",
                "col_2": "y",
                "col_3": "z"
                },
        "var_3": "another_string"
        
    }
]

I used a YAML-to-JSON converter and got the following JSON, but my process to flatten the file does not seem to pick up the nested information.

{
  "$id": "main.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "some data",
  "type": "object",
  "properties": {
    "var_1": {
      "$ref": "another_schema.json#/definitions/var_1"
    },
    "var_2": {
      "type": "object",
      "properties": {
        "fieldNames": {
          "uniqueItems": true,
          "type": "array",
          "items": {
            "type": "string"
          }
        }
      }
    },
    "var_3": {
      "type": "string",
      "description": "another variable"
    }
  }
}

Is there another way to reference all the variables/items inside fields/fieldNames (col_1, col_2, col_3)?

  • Your schema isn't valid JSON (just count your braces). Also, in the schema you define the top level to be an object, but your content starts with a [, which is an array. And your property var_2 is defined as an array, but your content has it as an object (by means of {}). And, could you elaborate on what you mean by "reference all the variables/items"? Commented Nov 4, 2022 at 11:18
  • @DanielSchneider I have updated the JSON to be valid. I want to be able to create a schema for all the columns (1, 2, 3) without explicitly naming them. Commented Nov 4, 2022 at 15:01
  • Provided an answer below -- let me know if that wasn't what you were after. Commented Nov 6, 2022 at 13:40

1 Answer

I assume that you want to enforce that all properties under var_2 are of type string. I can think of two ways of doing that:

  1. Define additionalProperties with additional constraints, concretely "type": "string":
      "var_2": {
        "type": "object",
        "additionalProperties": {
          "type": "string"
        }
      },
  2. Use patternProperties with a regex that matches every field name (".*"). The constraints you define apply to each property whose name matches the regex, and .* matches all names:
      "var_2": {
        "type": "object",
        "patternProperties": {
          ".*": {
            "type": "string"
          }
        },
        "additionalProperties": false
      },

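Putting both into one schema (and accounting for the fact that your content is an array at the top level) gives you the following. The two variants are named var_2a and var_2b so that they can sit side by side: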

{
  "$id": "main.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "array",
  "items": {
    "title": "some data",
    "type": "object",
    "properties": {
      "var_1": {
        "type": "string"
      },
      "var_2a": {
        "type": "object",
        "patternProperties": {
          ".*": {
            "type": "string"
          }
        },
        "additionalProperties": false
      },
      "var_2b": {
        "type": "object",
        "additionalProperties": {
          "type": "string"
        }
      },
      "var_3": {
        "type": "string"
      }
    },
    "additionalProperties": false
  }
}

This schema accepts the following data:

[
    {
        "var_1": "some_string",
        "var_2a": {"foo": "x", "bar": "y"},
        "var_2b": {"foo": "x", "bar": "y"},
        "var_3": "another_string"        
    }
]

But it rejects this, because the values under var_2a and var_2b are not strings:

[
    {
        "var_1": "some_string",
        "var_2a": {"foo": 1},
        "var_2b": {"foo": true},
        "var_3": "another_string"        
    }
]
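
To check this yourself, here is a minimal sketch using Draft7Validator from the Python jsonschema package (a third-party library, so it assumes pip install jsonschema). The variable names are my own; the schema and the two data samples are the ones shown above.

```python
from jsonschema import Draft7Validator

# The combined schema from above, written as a Python dict.
schema = {
    "$id": "main.json",
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "array",
    "items": {
        "title": "some data",
        "type": "object",
        "properties": {
            "var_1": {"type": "string"},
            "var_2a": {
                "type": "object",
                "patternProperties": {".*": {"type": "string"}},
                "additionalProperties": False,
            },
            "var_2b": {
                "type": "object",
                "additionalProperties": {"type": "string"},
            },
            "var_3": {"type": "string"},
        },
        "additionalProperties": False,
    },
}

good = [{
    "var_1": "some_string",
    "var_2a": {"foo": "x", "bar": "y"},
    "var_2b": {"foo": "x", "bar": "y"},
    "var_3": "another_string",
}]

bad = [{
    "var_1": "some_string",
    "var_2a": {"foo": 1},
    "var_2b": {"foo": True},
    "var_3": "another_string",
}]

validator = Draft7Validator(schema)

good_ok = validator.is_valid(good)
bad_errors = list(validator.iter_errors(bad))
# Location of each failure inside the data, e.g. "0/var_2a/foo".
bad_paths = sorted("/".join(str(p) for p in e.absolute_path) for e in bad_errors)

print("good sample valid:", good_ok)
for error in bad_errors:
    print(list(error.absolute_path), error.message)
```

iter_errors is used for the failing sample rather than validate so that every offending property is reported, not only the first one.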

3 Comments

Thank you, @DanielSchneider. This is a really helpful explanation. It seems like this should work, but it fails when it runs through this line of Python: json_schema["properties"].items(), with the error "KeyError: 'properties'". Any idea what I should do to fix that?
The above data will validate fine against the schema if you use a tool like Draft7Validator from jsonschema -- how are you validating the data against the schema?
I think your response is correct, but the information is being passed through an AWS Step Function, which unfortunately only allows strict schemas. Thanks for your help.
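
On the KeyError itself: in the combined schema the top level is an array, so there is no top-level "properties" key. The object schema sits under "items". The flattening code is not shown in the thread, so the following is only a guess at what it needs: a stdlib-only sketch of a walker that descends through items, properties, patternProperties and additionalProperties instead of indexing json_schema["properties"] directly. The function name walk_schema and the path notation are hypothetical.

```python
def walk_schema(schema, path=""):
    """Yield (path, type) pairs for every non-container subschema."""
    kind = schema.get("type")
    if kind == "array" and isinstance(schema.get("items"), dict):
        # Arrays keep their element schema under "items".
        yield from walk_schema(schema["items"], path + "[]")
    elif kind == "object":
        prefix = path + "." if path else ""
        # Explicitly named properties.
        for name, sub in schema.get("properties", {}).items():
            yield from walk_schema(sub, prefix + name)
        # Properties matched by regex; the pattern is shown in angle brackets.
        for pattern, sub in schema.get("patternProperties", {}).items():
            yield from walk_schema(sub, f"{prefix}<{pattern}>")
        # additionalProperties may be a boolean or a schema; only a schema is walked.
        extra = schema.get("additionalProperties")
        if isinstance(extra, dict):
            yield from walk_schema(extra, prefix + "*")
    else:
        yield path, kind


json_schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "var_1": {"type": "string"},
            "var_2a": {
                "type": "object",
                "patternProperties": {".*": {"type": "string"}},
                "additionalProperties": False,
            },
            "var_2b": {
                "type": "object",
                "additionalProperties": {"type": "string"},
            },
            "var_3": {"type": "string"},
        },
        "additionalProperties": False,
    },
}

leaves = list(walk_schema(json_schema))
for leaf_path, leaf_type in leaves:
    print(leaf_path, leaf_type)
```

Using .get with a default means a schema without a "properties" key is skipped rather than raising a KeyError.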
