Given the following json string: {"key":"val"ue","other":"invalid ""quo"te"}
I want to capture each illegal double quote inside the values. In the example there is one double quote in the value of the key property and there are three double quotes in the property called other.
I've seen multiple comments noting that this is invalid json (correct) and that the supplied json should be valid before receiving. However this is not possible in my case.
Assuming that this would only occur in the values and not in keys, I think it's safe to assume that a starting sequence would be a colon followed by a double quote. An ending sequence would be a double quote followed by a comma OR a closing curly brace.
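To make that assumption concrete, here is a minimal two-step sketch: first isolate each value body between the assumed delimiters, then treat every double quote left inside a body as illegal. The function name findInnerQuotes is made up for illustration, and this only holds if a value never contains a double quote directly followed by a comma or closing brace.

```php
<?php
// Hypothetical helper (not part of any library): returns the offset of every
// double quote found inside a value. Assumes a value starts with :" (with an
// optional whitespace character) and ends at the first " followed by , or }.
function findInnerQuotes(string $text): array
{
    $offsets = [];
    // Step 1: capture each value body together with its offset in $text.
    preg_match_all('/:\s?"(.*?)"(?=[,}])/', $text, $values, PREG_OFFSET_CAPTURE);
    foreach ($values[1] as [$body, $start]) {
        // Step 2: any double quote remaining inside the body is an illegal one.
        preg_match_all('/"/', $body, $quotes, PREG_OFFSET_CAPTURE);
        foreach ($quotes[0] as [, $pos]) {
            $offsets[] = $start + $pos;
        }
    }
    return $offsets;
}

$text = '{"key":"val"ue","other":"invalid ""quo"te"}';
print_r(findInnerQuotes($text)); // offsets 11, 33, 34 and 38 for this sample
```

This gives four separate results for the sample string, but it takes two passes rather than a single regex.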
I've tried the following regex (among many other versions), which is the closest to my desired solution:
/:\s?".*?(").*?[,}]/i
This correctly captures the one double quote in the key property, but only captures the first double quote in the 'other' property. I would like it to capture the other two double quotes as well, each as a separate capture.
Another regex I've tried: /:\s?".*?("{1,})[^,}].*?[,}]/i
This does the same as the first regex, but captures the two adjacent double quotes in one capture (not preferable).
My goal ultimately is to capture each double quote separately, so four captures. What I think I need in order to accomplish this is a way to make the capture group 'greedy?' so that it doesn't stop at the first double quote.
How could I achieve this?
I am using the following PHP code to test the Regex:
$text = '{"key":"val"ue","other":"invalid ""quo"te"}';
$pattern = '/:\s?".*?(").*?[,}]/i';
preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE);
echo '<pre>' . print_r($matches, true) . '</pre>';
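For comparison, a single-pattern sketch of the same goal, using \G to continue from the previous match and \K to drop everything before the quote from the reported match. This is only an idea under the assumptions stated earlier (values start with :" and end with a " followed by , or }), not a verified general solution.

```php
<?php
$text = '{"key":"val"ue","other":"invalid ""quo"te"}';
// (?:\G(?!\A)|:\s?")  continue right after the previous match, or start a new value
// [^"]*\K             skip non-quote characters, then reset the start of the match
// "(?![,}])           a double quote that is NOT the assumed closing quote
$pattern = '/(?:\G(?!\A)|:\s?")[^"]*\K"(?![,}])/';
preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE);
// Each entry of $matches[0] is one illegal quote with its offset.
print_r($matches[0]); // four entries, at offsets 11, 33, 34 and 38
```

Here each illegal quote is its own full match rather than a capture group, so the count of $matches[0] is the number of quotes found.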