I have a JSON file, and I want to use PHP to change the "Systems_x0020_Changed_IDs" value from a string to an array, so that "39122" becomes [39122] and "39223, 39244, 39395" becomes [39223, 39244, 39395]. I am using http://www.regexpal.com/ to test my expression. The expression is:

"([(0-9)+((, *))]+)+"

This produces unexpected results in PHP. My JSON file is:

[{
        "ID": 1050436,
        "Title": "THE SKY IS FALLING!!!!",
        "Application_x0020_ID": 242,
        "Systems_x0020_Changed": "Academic Planning System (APS),\"Documents planning and evaluation processes at UGA that support cont",
        "Systems_x0020_Changed_IDs": "39122",
        "Status": "New",
        "Modified": "2015-10-28T16:14:45.573-04:00",
        "Age": 40,
        "Description_x0020__x0028_Public_x0029_": "I'm chicken little and the SKY IS FALLING!",
        "Impact_x0020__x0028_Public_x0029_": "The world is going to end!",
        "Start_x0020_Time": "2015-10-28T00:00:00-04:00",
        "End_x0020_Time": "2015-10-30T00:00:00-04:00",
        "Hours": 12
    }, {
        "ID": 1050740,
        "Title": "This is a Title",
        "Application_x0020_ID": 242,
        "Systems_x0020_Changed": "EITS Websites,\"EITS departmental web pages.\", GACRC Archival Storage,\"Archival Storage for Research Data\", VPS,\"Mainframe distributed printing system\"",
        "Systems_x0020_Changed_IDs": "39223, 39244, 39395",
        "Status": "New",
        "Modified": "2015-11-05T17:31:13.15-05:00",
        "Age": 32,
        "Description_x0020__x0028_Public_x0029_": "We will tell jokes to the clients",
        "Impact_x0020__x0028_Public_x0029_": "Everyone will notice the change.",
        "Start_x0020_Time": "2015-11-27T08:38:00-05:00",
        "End_x0020_Time": "2015-11-30T00:00:00-05:00",
        "Hours": 1
    }]

Several commas at the ends of lines are being wrapped in brackets ([,]), swallowing the quotes around them, so that the output looks like:

[{
    "ID": 1050436,
    "Title": "THE SKY IS FALLING!!!![,]Application_x0020_ID": 242,
    "Systems_x0020_Changed": "Academic Planning System (APS),\"Documents planning and evaluation processes at UGA that support cont[,]Systems_x0020_Changed_IDs": 39122,
    "Status": "New[,]Modified": "2015-10-28T16:14:45.573-04:00[,]Age": 40,
    "Description_x0020__x0028_Public_x0029_": "I'm chicken little and the SKY IS FALLING![,]Impact_x0020__x0028_Public_x0029_": "The world is going to end![,]Start_x0020_Time": "2015-10-28T00:00:00-04:00[,]End_x0020_Time": "2015-10-30T00:00:00-04:00[,]Hours": 12
}, {
    "ID": 1050740,
    "Title": "This is a Title[,]Application_x0020_ID": 242,
    "Systems_x0020_Changed": "EITS Websites,\"EITS departmental web pages.\", GACRC Archival Storage,\"Archival Storage for Research Data\", VPS,\"Mainframe distributed printing system\"[,]Systems_x0020_Changed_IDs": [39223, 39244, 39395],
    "Status": "New[,]Modified": "2015-11-05T17:31:13.15-05:00[,]Age": 32,
    "Description_x0020__x0028_Public_x0029_": "We will tell jokes to the clients[,]Impact_x0020__x0028_Public_x0029_": "Everyone will notice the change.[,]Start_x0020_Time": "2015-11-27T08:38:00-05:00[,]End_x0020_Time": "2015-11-30T00:00:00-05:00[,]Hours": 1
}]

My question is, how can I modify the expression so that PHP will behave like regexpal.com and only get the numbers within quotes and ignore the rest?

    Modifying structured data like JSON with regexes is asking for trouble. I suggest reading it into a proper structure with json_decode, modifying the structure, and writing it back out with json_encode. Commented Dec 7, 2015 at 20:41

3 Answers

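Your regex is rather strange: you appear to be putting a pattern expression inside a character class [...], which is probably not doing what you expect. Everything between the square brackets is treated as a set of individual characters, so the parentheses, + and * there are matched literally rather than acting as groups or quantifiers. Furthermore, your regex would match values inside other key/value pairs. Try this instead, which will only match values for the key "Systems_x0020_Changed_IDs":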

"Systems_x0020_Changed_IDs":\s+"([^"]*)"
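To make the difference concrete, here is a minimal sketch rather than a definitive fix. It assumes the real file has no whitespace between members (so adjacent strings are separated by `","`), which would explain the `[,]` in the question's output. The sample string and variable names are made up for illustration, and the key-anchored pattern uses `\s*` instead of `\s+` so that it also works on this compact sample.

```php
<?php
// Hypothetical compact fragment standing in for the real file.
$json = '{"Title":"A title","Status":"New","Systems_x0020_Changed_IDs":"39223, 39244, 39395"}';

// The question's pattern. Everything inside [...] is one character class, so it
// matches any quoted run of digits, parentheses, '+', '*', commas and spaces.
// That includes the lone comma in "," between two adjacent strings.
$original = '/"([(0-9)+((, *))]+)+"/';
$broken = preg_replace($original, '[$1]', $json);
echo $broken, "\n";

// Key-anchored variant of the pattern above (assumption: \s* instead of \s+).
// Group 1 keeps the key and colon, group 2 is the text between the quotes.
$anchored = '/("Systems_x0020_Changed_IDs":\s*)"([^"]*)"/';
$fixed = preg_replace_callback($anchored, function (array $m): string {
    // Split on commas with optional surrounding whitespace, dropping empty pieces.
    // Note: this does not check that each piece is numeric.
    $ids = preg_split('/\s*,\s*/', trim($m[2]), -1, PREG_SPLIT_NO_EMPTY);
    return $m[1] . '[' . implode(', ', $ids) . ']';
}, $json);
echo $fixed, "\n";
```

On this sample the first replacement turns each `","` into `[,]`, while the anchored one touches only the intended value.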

2 Comments

@miken32 \d+ will fail on "Systems_x0020_Changed_IDs": "39223, 39244, 39395"
@miken32 - That might be true, but the post certainly doesn't say anything about wanting to match only digits, so I didn't make any assumption about what is in the field.

What about just parsing it as the JSON that it is?

$jsons = array('{
        "ID": 1050436,
        "Title": "THE SKY IS FALLING!!!!",
        "Application_x0020_ID": 242,
        "Systems_x0020_Changed": "Academic Planning System (APS),\"Documents planning and evaluation processes at UGA that support cont",
        "Systems_x0020_Changed_IDs": "39122",
        "Status": "New",
        "Modified": "2015-10-28T16:14:45.573-04:00",
        "Age": 40,
        "Description_x0020__x0028_Public_x0029_": "I\'m chicken little and the SKY IS FALLING!",
        "Impact_x0020__x0028_Public_x0029_": "The world is going to end!",
        "Start_x0020_Time": "2015-10-28T00:00:00-04:00",
        "End_x0020_Time": "2015-10-30T00:00:00-04:00",
        "Hours": 12
    }', '{
        "ID": 1050740,
        "Title": "This is a Title",
        "Application_x0020_ID": 242,
        "Systems_x0020_Changed": "EITS Websites,\"EITS departmental web pages.\", GACRC Archival Storage,\"Archival Storage for Research Data\", VPS,\"Mainframe distributed printing system\"",
        "Systems_x0020_Changed_IDs": "39223, 39244, 39395",
        "Status": "New",
        "Modified": "2015-11-05T17:31:13.15-05:00",
        "Age": 32,
        "Description_x0020__x0028_Public_x0029_": "We will tell jokes to the clients",
        "Impact_x0020__x0028_Public_x0029_": "Everyone will notice the change.",
        "Start_x0020_Time": "2015-11-27T08:38:00-05:00",
        "End_x0020_Time": "2015-11-30T00:00:00-05:00",
        "Hours": 1
    }');
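foreach($jsons as $json){
     // Decode each object into an associative array.
     $json_array = json_decode($json, true);
     // The value is still a string at this point, e.g. "39223, 39244, 39395".
     echo $json_array['Systems_x0020_Changed_IDs'] . "\n";
}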

Demo: https://eval.in/481865
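The snippet above only prints the string. To actually turn it into an array and write the JSON back out, a decode, modify, encode round trip could look like the sketch below. The input is a trimmed-down, hypothetical version of the question's data, and the variable names are my own. It assumes every ID is an integer.

```php
<?php
// Hypothetical input: two records reduced to the fields that matter here.
$json = '[{"ID": 1050436, "Systems_x0020_Changed_IDs": "39122"},
          {"ID": 1050740, "Systems_x0020_Changed_IDs": "39223, 39244, 39395"}]';

$records = json_decode($json, true);
if (!is_array($records)) {
    throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
}

foreach ($records as &$record) {
    $raw = $record['Systems_x0020_Changed_IDs'] ?? null;
    if (!is_string($raw)) {
        continue; // key missing, or value already converted
    }
    // Split on commas with optional surrounding whitespace, then cast to int.
    $parts = preg_split('/\s*,\s*/', trim($raw), -1, PREG_SPLIT_NO_EMPTY);
    $record['Systems_x0020_Changed_IDs'] = array_map('intval', $parts);
}
unset($record); // break the reference left over from the loop

$result = json_encode($records);
echo $result, "\n";
```

Re-encoding can change whitespace and escaping compared with the original file. Flags such as JSON_PRETTY_PRINT and JSON_UNESCAPED_SLASHES may help if the layout matters.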

If you need a regex, you could do something like:

"Systems_x0020_Changed_IDs":\h*"(([\d+],?\h*)*)"

Demo: https://regex101.com/r/yZ6eM3/1

PHP Usage:

$string = '{
        "ID": 1050436,
        "Title": "THE SKY IS FALLING!!!!",
        "Application_x0020_ID": 242,
        "Systems_x0020_Changed": "Academic Planning System (APS),\"Documents planning and evaluation processes at UGA that support cont",
        "Systems_x0020_Changed_IDs": "39122",
        "Status": "New",
        "Modified": "2015-10-28T16:14:45.573-04:00",
        "Age": 40,
        "Description_x0020__x0028_Public_x0029_": "I\'m chicken little and the SKY IS FALLING!",
        "Impact_x0020__x0028_Public_x0029_": "The world is going to end!",
        "Start_x0020_Time": "2015-10-28T00:00:00-04:00",
        "End_x0020_Time": "2015-10-30T00:00:00-04:00",
        "Hours": 12
    }, {
        "ID": 1050740,
        "Title": "This is a Title",
        "Application_x0020_ID": 242,
        "Systems_x0020_Changed": "EITS Websites,\"EITS departmental web pages.\", GACRC Archival Storage,\"Archival Storage for Research Data\", VPS,\"Mainframe distributed printing system\"",
        "Systems_x0020_Changed_IDs": "39223, 39244, 39395",
        "Status": "New",
        "Modified": "2015-11-05T17:31:13.15-05:00",
        "Age": 32,
        "Description_x0020__x0028_Public_x0029_": "We will tell jokes to the clients",
        "Impact_x0020__x0028_Public_x0029_": "Everyone will notice the change.",
        "Start_x0020_Time": "2015-11-27T08:38:00-05:00",
        "End_x0020_Time": "2015-11-30T00:00:00-05:00",
        "Hours": 1
    }';
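// Note: [\d+] is a character class (one digit or a literal "+"), so the outer * does the repeating.
$regex = '/"Systems_x0020_Changed_IDs":\h*"((?:[\d+],?\h*)*)"/';
preg_match_all($regex, $string, $matches);
// $matches[1] holds the text captured between the quotes, one entry per record.
print_r($matches[1]);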

Output:

Array
(
    [0] => 39122
    [1] => 39223, 39244, 39395
)

Demo #2: https://eval.in/481871

The answer I was looking for is:

$str = preg_replace('/"((\d+[, ]*)+)"/', "[$1]", $str);

I needed the JSON file left as is, except for the numeric values stored as strings. My regex worked after I played with it a little more.
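One caveat worth checking before relying on this: because the pattern is not tied to a key, it will also rewrite any other quoted string that consists only of digits, commas and spaces. A minimal sketch with a made-up record (the "Title" value here is hypothetical, chosen to trigger the effect):

```php
<?php
// Hypothetical record whose Title happens to be all digits.
$str = '{"Title": "2015", "Systems_x0020_Changed_IDs": "39223, 39244"}';

// Same pattern as above; single quotes keep $1 as a plain backreference.
$out = preg_replace('/"((\d+[, ]*)+)"/', '[$1]', $str);
echo $out, "\n";
// Both values are rewritten:
// {"Title": [2015], "Systems_x0020_Changed_IDs": [39223, 39244]}
```

If other fields can hold digit-only strings, anchoring the pattern on the key name avoids this.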

3 Comments

Isn't that the regex I gave in the second part of my answer, just less strict? (([\d+],?\h*)*)
It may be. I didn't even pay attention because you added Systems_x0020_Changed_IDs to the front of it, which is not something I wanted to do. But thanks for the help anyway!
Now that I am looking at your regex, I see that it may be a better solution. Thanks for the help!
