Suppose we have a file called configuration.js with the following content:

'use strict';
var profile = {
    "project": "%ProjectsRoot%\\SampleProject\\Site\\Site.csproj",
    "projectsRootKey": "%ProjectsRoot%",
    "ftp": {
        "address": "ftp://192.168.40.50/",
        "username": "",
        "password": ""
    },
    "delete": [
        "\\b(bin)\\b.*\\.config",
        "\\b(bin)\\b.*\\.js",
        "\\b(bin)\\b.*\\.css",
        "bin\\\\(?!ProjectName).*\\.(dll|pdb)"
    ],
    "replace": [
        {
            "file": "Web.config",
            "items": [
                {
                    "regex": "(<appSettings file=\")(bin\\\\)(Settings.config\">)",
                    "newValue": "$1$3"
                },
                {
                    "regex": "<remove\\s*segment=.bin.\\s/>",
                    "newValue": ""
                }
            ]
        }
    ]
};

In this case, the content of the .js file is intended to be pure JSON, but it is written as a JavaScript statement so that the IDE recognizes the content and formats it correctly. In another scenario the file might contain:

{
  "project": "%ProjectsRoot%\\SampleProject\\Site\\Site.csproj",
  "projectsRootKey": "%ProjectsRoot%",
  "ftp": {
    "address": "ftp://192.168.40.50/",
    "username": "",
    "password": ""
  },
  "delete": [
    "\\b(bin)\\b.*\\.config",
    "\\b(bin)\\b.*\\.js",
    "\\b(bin)\\b.*\\.css",
    "bin\\\\(?!ProjectName).*\\.(dll|pdb)"
  ],
  "replace": [
    {
      "file": "Web.config",
      "items": [
        {
          "regex": "(<appSettings file=\")(bin\\\\)(Settings.config\">)",
          "newValue": "$1$3"
        },
        {
          "regex": "<remove\\s*segment=.bin.\\s/>",
          "newValue": ""
        }
      ]
    }
  ]
}

In both cases, the file extension should be .json rather than .js. We're building a code-quality tool with many features, one of which is to suggest that the developer change a file's extension based on its content.

In both cases, how can we make sure that the file only contains JSON, or is INTENDED to only contain JSON?

Note: the JSON example is deliberately complex so that it represents a real-world sample.

  • The first snippet is definitely JavaScript, not JSON. Commented Aug 3, 2016 at 12:23
  • You'll have to search for patterns in the content of the file: define what makes a JSON file valid and search for it. Try searching for {"...":"..."}, ignoring spaces and line endings. Commented Aug 3, 2016 at 12:23
  • Your first example contains only a JS object (the second one may also be a JS object, just without a direct assignment). If you include the file for use in JS, then you can't access the second example in any way. Also, what do you mean by "intended to be JSON"? Commented Aug 3, 2016 at 12:23
  • Very difficult one. I guess the real question is what makes JSON not JavaScript. I would suggest the lack of functionality: it makes no function calls and declares no function bodies. I would therefore look for some form of regex to find 'function' definitions or members that place calls. The latter is obviously the harder of the two. Intrigued to see what answers people propose! Commented Aug 3, 2016 at 12:24

2 Answers

To cover the second case, all you need to do is feed the file to a JSON parser with very strict settings; if the parser rejects the file, it is not a JSON file.

To cover the first case, as long as you only need to validate that very specific pattern, one possibility is to use a regex to remove both the semicolon at the end and the 'use strict'; var something = prefix at the start, and then pass the cleaned-up text through a JSON parser to see whether it is valid JSON.

If you need to handle more complex cases, you could use a JavaScript parser to generate an AST from the file and then walk the tree to validate it (for example, checking that it contains a single variable, no functions, no other statements, etc.). That is somewhat more complex, but much more powerful.

var STRICT_JSON_EXAMPLE = '{"value": "ok"}';
var JSON_LIKE_EXAMPLE = '\'use strict\';\nvar somevar = {"value": "ok"};';
var NON_JSON_EXAMPLE = 'alert("!!!");';

var EXAMPLES = [ STRICT_JSON_EXAMPLE, JSON_LIKE_EXAMPLE, NON_JSON_EXAMPLE ];

// True when the whole text parses as JSON.
function isStrictJSON(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    return false;
  }
}

// True when the text is JSON, optionally wrapped as
// 'use strict'; var name = <JSON>;
function isJSONLike(text) {
  // [\s\S] is used instead of . so that the value may span several lines.
  var regex = /^\s*(['"]use strict['"]\s*;?)?\s*var\s+\w+\s*=\s*([\s\S]*?)\s*;?\s*$/;
  var cleanedText = text.replace(regex, '$2');
  return isStrictJSON(cleanedText);
}

// Prints:
// Strict JSON: true, false, false
// JSON-like: true, true, false
console.log('Strict JSON: ' + EXAMPLES.map(isStrictJSON).join(', ') +
  '\nJSON-like: ' + EXAMPLES.map(isJSONLike).join(', '));
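
The AST approach mentioned above could look roughly like the sketch below. It assumes the tree is in the ESTree format that parsers such as Acorn or Esprima produce; to stay dependency-free, the sample tree here is written by hand rather than parsed. The names isJSONValueNode, isJSONLikeProgram and sampleAst are made up for illustration, and the rules (a single declaration, unquoted keys tolerated) are only one possible reading of "intended to be JSON".

```javascript
// Returns true when an ESTree expression node uses only JSON-compatible syntax.
function isJSONValueNode(node) {
  if (!node) return false; // array holes and missing initializers are null
  switch (node.type) {
    case 'Literal':
      // Strings, numbers, booleans and null are fine; regex/bigint are not.
      if (node.regex || node.bigint) return false;
      return node.value === null ||
        ['string', 'number', 'boolean'].indexOf(typeof node.value) !== -1;
    case 'UnaryExpression':
      // Negative numbers appear as a unary minus on a numeric literal.
      return node.operator === '-' &&
        node.argument.type === 'Literal' &&
        typeof node.argument.value === 'number';
    case 'ArrayExpression':
      return node.elements.every(isJSONValueNode);
    case 'ObjectExpression':
      return node.properties.every(function (prop) {
        return prop.type === 'Property' && prop.kind === 'init' &&
          !prop.computed && !prop.method && !prop.shorthand &&
          isJSONValueNode(prop.value);
      });
    default:
      return false; // functions, calls, identifiers, templates, ...
  }
}

// Returns true when the program is a single variable declaration whose
// initializer is JSON-compatible, ignoring 'use strict' statements.
function isJSONLikeProgram(ast) {
  var body = ast.body.filter(function (stmt) {
    return !(stmt.type === 'ExpressionStatement' &&
      stmt.expression.type === 'Literal' &&
      stmt.expression.value === 'use strict');
  });
  if (body.length !== 1 || body[0].type !== 'VariableDeclaration') {
    return false;
  }
  return body[0].declarations.length === 1 &&
    isJSONValueNode(body[0].declarations[0].init);
}

// Hand-written ESTree for: var profile = { "password": "" };
var sampleAst = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'var',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'profile' },
      init: {
        type: 'ObjectExpression',
        properties: [{
          type: 'Property', kind: 'init',
          computed: false, method: false, shorthand: false,
          key: { type: 'Literal', value: 'password' },
          value: { type: 'Literal', value: '' }
        }]
      }
    }]
  }]
};

console.log(isJSONLikeProgram(sampleAst)); // true
```

In a real tool the tree would come from whichever parser you choose, and you would probably want to tighten or loosen the checks (for example, rejecting unquoted keys) depending on how strict the suggestion should be.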

You'll have to search for patterns in the content of the file: define what makes a JSON file valid and search for it. Try searching for {"...":"..."}, ignoring spaces and line endings. I used to do something like that in a Word + C# tool created to edit contracts, and after a while the team concluded that pattern recognition was the way to go.

My suggestion is to create patterns for the different file types and suggest the types that get the most matches or, if you have to pick just one, the single file type with the most matches.
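
A minimal sketch of that scoring idea follows. The pattern lists, the PATTERNS table and the function names are hypothetical and deliberately naive; text inside string values can fool them, so treat the result as a hint rather than a verdict.

```javascript
// Hypothetical pattern table: each file type lists regexes that hint at it.
var PATTERNS = {
  json: [
    /^\s*[\[{]/g,        // starts with an object or array
    /"[^"\\]*"\s*:/g,    // "key": pairs
    /[\]}]\s*$/g         // ends with a closing bracket or brace
  ],
  javascript: [
    /\bfunction\b/g,               // function keyword
    /\b(?:var|let|const)\s+\w+/g,  // variable declarations
    /\w+\s*\(/g,                   // something that looks like a call
    /;\s*$/gm                      // lines ending in a semicolon
  ]
};

// Counts how many times a global regex matches the text.
function countMatches(text, regex) {
  var matches = text.match(regex);
  return matches ? matches.length : 0;
}

// Returns an object mapping each file type to its total match count.
function scoreFileTypes(text) {
  var scores = {};
  Object.keys(PATTERNS).forEach(function (type) {
    scores[type] = PATTERNS[type].reduce(function (sum, regex) {
      return sum + countMatches(text, regex);
    }, 0);
  });
  return scores;
}

// Returns the type with the highest score, or null when nothing matched.
function suggestType(text) {
  var scores = scoreFileTypes(text);
  var best = null;
  Object.keys(scores).forEach(function (type) {
    if (scores[type] > 0 && (best === null || scores[type] > scores[best])) {
      best = type;
    }
  });
  return best;
}

console.log(suggestType('{"a": 1, "b": {"c": "d"}}'));
console.log(suggestType('function f() { return g(1); }\nvar x = f();'));
```

Because raw counts grow with file size, a real tool might normalize the scores or combine them with a parser-based check before suggesting an extension change.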
