I believe this has been asked many times before, but I could not find a way to make it work for JSON content. My negative pattern matches every JSON string, even when the substring is present. I'm sure I must be doing something wrong.

The idea is to match a JSON string that does not contain the string "key", and not match one that does.

Note: I need to achieve this via "re.match" with a negative regex (rather than matching and then negating the result in Python), because I run many expressions in bulk and can't change how the function works for a single expression.

For example, here are my two JSON strings:

'{"key": "success", "name": "peter"}'
'{"name": "sam"}'

And I'm using the regex pattern below for the negative match:

((?!key).).*

The result is:

Python 3.9.5 (default, May 11 2021, 08:20:37) 
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> pattern = r"((?!key).).*"
>>> jsonstring = '{"key": "success", "name": "peter"}'
>>> re.match(pattern, jsonstring)
<re.Match object; span=(0, 35), match='{"key": "success", "name": "peter"}'>

>>> jsonstring = '{"name": "sam"}'
>>> re.match(pattern, jsonstring)
<re.Match object; span=(0, 15), match='{"name": "sam"}'>

Am I doing anything terribly wrong here? I have tried different patterns, but without success so far.

1 Answer

((?!key).).* matches a non-empty sequence of characters (the ..* part is equivalent to .+) that does not start with "key". More precisely, the lookahead is checked only once, at the very beginning of the string, which must not be immediately followed by the word "key". Neither of the two strings starts with "key", so both of them match the pattern. Note that the parentheses serve no purpose here.
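
To make the point about the lookahead position concrete, here is a small sketch. The second pattern, ((?!key).)*$, is an assumption on my part about what the question's pattern may have been aiming for: it repeats the guarded character, so the lookahead is checked at every position, and anchors the end of the string. It is not taken from the question.

```python
import re

with_key = '{"key": "success", "name": "peter"}'
without_key = '{"name": "sam"}'

# Pattern from the question: the lookahead is tested once, at position 0 only.
original = re.compile(r"((?!key).).*")

# Hypothetical variant: the lookahead guards every consumed character,
# and "$" forces the match to reach the end of the string.
guarded = re.compile(r"((?!key).)*$")

original_results = [bool(original.match(s)) for s in (with_key, without_key)]
guarded_results = [bool(guarded.match(s)) for s in (with_key, without_key)]

print(original_results)  # both strings match
print(guarded_results)   # only the string without "key" matches
```

Note that this variant rejects the bare substring key anywhere in the text, quoted or not.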

You may want to use (?!.*"key").*:

>>> import re
>>> pattern = r"(?!.*\"key\").*"
>>> jsonstring = '{"key": "success", "name": "peter"}'
>>> re.match(pattern, jsonstring)  # no match: returns None, so nothing is printed

>>> jsonstring = '{"name": "sam"}'
>>> re.match(pattern, jsonstring)
<re.Match object; span=(0, 15), match='{"name": "sam"}'>

This works in this case, although a regex is not a good way to inspect a JSON string.
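
One caveat worth checking, assuming the input may span several lines (for example pretty-printed JSON, which the question does not show): by default . does not match a newline, so the lookahead only sees the first line. A minimal sketch of the difference when re.DOTALL is passed:

```python
import re

# Assumed input: the same kind of object, but pretty-printed over several lines.
multiline = '{\n  "key": "success",\n  "name": "peter"\n}'

# Same pattern as above; "." stops at the first newline.
line_bound = re.compile(r'(?!.*"key").*')

# With DOTALL, "." also matches newlines, so the lookahead scans the whole text.
whole_text = re.compile(r'(?!.*"key").*', re.DOTALL)

line_bound_hit = line_bound.match(multiline)
whole_text_hit = whole_text.match(multiline)

print(line_bound_hit)  # a match on the first line only, although "key" is present
print(whole_text_hit)  # None, because "key" is found on a later line
```

If the input is always a single line, the flag makes no difference.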

The best way is to use a JSON parser:

>>> import json
>>> jsonstring = '{"key": "success", "name": "peter"}'
>>> obj = json.loads(jsonstring)
>>> "key" not in obj
False
>>> jsonstring = '{"name": "sam"}'
>>> obj = json.loads(jsonstring)
>>> "key" not in obj
True
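
The check above only looks at the top-level keys. If the object may be nested, a recursive walk is one way to extend it. This is a minimal sketch; contains_key is a hypothetical helper name, not part of the json module.

```python
import json


def contains_key(node, name):
    """Return True if `name` is a key of any object nested inside `node`."""
    if isinstance(node, dict):
        # Check this object's own keys, then descend into its values.
        return name in node or any(contains_key(v, name) for v in node.values())
    if isinstance(node, list):
        return any(contains_key(item, name) for item in node)
    # Strings, numbers, booleans and None have no keys.
    return False


nested = json.loads('{"name": "peter", "meta": [{"key": "success"}]}')
flat = json.loads('{"name": "sam", "note": "key"}')

print(contains_key(nested, "key"))  # True: "key" is a key inside the nested list
print(contains_key(flat, "key"))    # False: "key" appears only as a value
```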
3 Comments

Nice! This looks good and works perfectly. Thanks! I agree that this is not the most appropriate approach for JSON content. However, I'm using these patterns inside a function that handles all kinds of content, including text, CSV and JSON, so I'm sticking to something common and consistent. Also, the dictionary might be nested and huge, and I may not know exactly where the substring occurs.
@Danny OK. Please note that this will also exclude strings in which "key" appears as a value (e.g. {"foo" : "key"}) or as an array item (e.g. ["foo", "key"]). To avoid such false positives, you can use r"(?!.*\"key\"\s*:).*" instead.
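
A quick way to see the difference between the two patterns, using the example from this comment (the variable names are only illustrative):

```python
import re

# Rejects any occurrence of "key" in double quotes, key or value.
any_occurrence = re.compile(r'(?!.*"key").*')

# Rejects "key" only when it is followed by optional whitespace and a colon.
key_only = re.compile(r'(?!.*"key"\s*:).*')

as_value = '{"foo" : "key"}'
as_key = '{"key": "success"}'

value_results = (bool(any_occurrence.match(as_value)), bool(key_only.match(as_value)))
key_results = (bool(any_occurrence.match(as_key)), bool(key_only.match(as_key)))

print(value_results)  # the first pattern rejects the value, the second accepts it
print(key_results)    # both patterns reject the string where "key" is a key
```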
Yes, I modified it a bit to make it more general, and the answer gave me what I really needed to get started. Thanks for that. And, oh yeah, I forgot to accept this as the answer (just did!). Thanks a heap.
