I am using cURL to fetch a webpage. I then load the response from the cURL call into a DOMDocument:

$dom = new DOMDocument();
$dom->loadHTML($response);

If I output $dom, I can see all of the HTML for that page, as expected. Within the HTML, there is one specific section that looks like this:

<script id="data" type="application/json">
<![CDATA[
{
    sortColumn: "QuoteNumber",
    quotes: {
        "Data":
        [
            {
                "ID":3235720,
                "Date":"20 May 2016",
                "QuoteNumber":"Q12415",
                "Name":"Some Name",
                "Client":"Some Client",
                "StateName":"Issued",
                "Url":"/Quote/View/3235720"
            }
        ]
    }
}
]]>
</script>

Is there any way I can target just this specific block? I need to load the JSON and obtain the ID for the quote.

  • That piece of JSON is invalid. Did you copy it correctly into your question? There are three opening braces and only one closing brace. Also some properties (e.g. sortColumn) are not quoted, which is invalid in JSON. Commented Jun 10, 2016 at 11:14
  • Hi, thanks. There was a lot of JSON data and I removed much of it to keep the question simple. I must have missed some braces. Thanks. Commented Jun 10, 2016 at 11:36

1 Answer

  1. Get the <script> tag with getElementById("data").
  2. Find its CDATA child node by comparing each child's nodeType with the constant XML_CDATA_SECTION_NODE.
  3. Use str_replace() to remove the CDATA markers (<![CDATA[ and ]]>).
  4. Use json_decode() to parse the remaining text as JSON.

By the way, the content inside your CDATA is malformed JSON: property names such as sortColumn and quotes must be wrapped in double quotes. It should be corrected as follows:

<![CDATA[
{
    "sortColumn" : "QuoteNumber",
    "quotes": {
        "Data":
        [
            {
                "ID":3235720,
                "Date":"20 May 2016",
                "QuoteNumber":"Q12415",
                "Name":"Some Name",
                "Client":"Some Client",
                "StateName":"Issued",
                "Url":"/Quote/View/3235720"
            }
        ]
    }
}
]]>
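
If the page itself serves the bare property names and you cannot fix them at the source, one possible workaround is to quote them before decoding. This is only a minimal sketch: quote_bare_keys() is a hypothetical helper (not a PHP built-in), the sample string is a trimmed copy of the data from the question, and the regular expression assumes that no string value contains text shaped like "{ word:" or ", word:".

```php
<?php
// Hypothetical helper: wraps bare property names such as sortColumn in
// double quotes. Assumption: string values never contain "{ word:" or
// ", word:" themselves, otherwise they would be rewritten too.
function quote_bare_keys($text) {
    return preg_replace(
        '/([{,]\s*)([A-Za-z_][A-Za-z0-9_]*)(\s*:)/',
        '$1"$2"$3',
        $text
    );
}

// Trimmed sample of the malformed data from the question.
$raw = '{ sortColumn: "QuoteNumber", quotes: { "Data": [ { "ID": 3235720 } ] } }';

$decoded = json_decode(quote_bare_keys($raw));

echo $decoded->sortColumn, "\n";
echo $decoded->quotes->Data[0]->ID, "\n";
```

A regular expression is a blunt tool for this, so treat it as a stopgap for data whose shape you know, not as a general repair for invalid JSON.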

I have also added a has_json_error() function at the bottom so that you can see an error message when decoding fails.

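libxml_use_internal_errors(true); // keep parser warnings about real-world HTML quiet
$dom = new DOMDocument();
$dom->loadHTML($response);
$data = $dom->getElementById("data");
$content = '';
if ($data !== null) { // getElementById() returns null when the id is not found
    foreach ($data->childNodes as $child) {
        // Keep only the CDATA section child (step 2)
        if ($child->nodeType == XML_CDATA_SECTION_NODE) {
            $content = $child->textContent;
        }
    }
}
// Strip the CDATA markers in case they are still present in the text (step 3)
$content = str_replace(array("<![CDATA[", "]]>"), '', $content);
$jsons = json_decode($content);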

if(!has_json_error()) {
    echo $jsons->sortColumn;
    echo "<br /><br />";
    print_r($jsons->quotes);
    echo "<br /><br />";
    $data = $jsons->quotes->Data;
    foreach($data as $obj) {
        echo $obj->ID . "<br />";
        echo $obj->Date . "<br />";
        echo $obj->QuoteNumber . "<br />";
        echo $obj->Name . "<br />";
        echo $obj->Client . "<br />";
        echo $obj->StateName . "<br />";
        echo $obj->Url . "<br />";
    }
}

function has_json_error() {
    if (function_exists ( 'json_last_error' ) && json_last_error() !== JSON_ERROR_NONE) {
        switch (json_last_error()) {
            case JSON_ERROR_DEPTH:
                echo 'JSON_ERROR: - Maximum stack depth exceeded';
            break;
            case JSON_ERROR_STATE_MISMATCH:
                echo 'JSON_ERROR: - Underflow or the modes mismatch';
            break;
            case JSON_ERROR_CTRL_CHAR:
                echo 'JSON_ERROR: - Unexpected control character found';
            break;
            case JSON_ERROR_SYNTAX:
                echo 'JSON_ERROR: - Syntax error, malformed JSON';
            break;
            case JSON_ERROR_UTF8:
                echo 'JSON_ERROR: - Malformed UTF-8 characters, possibly incorrectly encoded';
            break;
            default:
                echo 'JSON_ERROR: - Unknown error: ' . json_last_error();
            break;
        }           
        return true;
    }
    else if (function_exists ( 'json_last_error_msg' ) && json_last_error_msg () !== "No error") {
        echo ("json_last_error_msg, JSON_ERROR:" . json_last_error_msg ());
        return true;
    }
    return false;
}
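
A shorter variant of the same check is possible when json_last_error_msg() is available (PHP 5.5 or later). The following is a sketch under that assumption; decode_json_or_fail() is a hypothetical name, and it throws instead of echoing so the caller decides how to report the problem.

```php
<?php
// Hypothetical helper: decodes JSON, or throws with PHP's own error text.
// Assumes PHP 5.5+ so that json_last_error_msg() exists.
function decode_json_or_fail($text) {
    $value = json_decode($text);
    if (json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('JSON_ERROR: ' . json_last_error_msg());
    }
    return $value;
}

try {
    // The bare key makes this input invalid, so the helper throws.
    decode_json_or_fail('{ sortColumn: "QuoteNumber" }');
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```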

The snippet above produces output like the following:

QuoteNumber

stdClass Object ( 
    [Data] => Array ( 
        [0] => stdClass Object ( 
            [ID] => 3235720 
            [Date] => 20 May 2016 
            [QuoteNumber] => Q12415 
            [Name] => Some Name 
            [Client] => Some Client 
            [StateName] => Issued 
            [Url] => /Quote/View/3235720 
        ) 
    ) 
) 

3235720
20 May 2016
Q12415
Some Name
Some Client
Issued
/Quote/View/3235720
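
Since the question only asks for the quote ID, the loop can be reduced to a single lookup. A minimal sketch, assuming the JSON has already been extracted and corrected as described above; the $content string here is a hard-coded sample trimmed to two fields, standing in for the text taken from the page.

```php
<?php
// Sample payload based on the corrected JSON, trimmed to two fields.
$content = '{"sortColumn":"QuoteNumber","quotes":{"Data":[{"ID":3235720,"QuoteNumber":"Q12415"}]}}';

// Passing true as the second argument returns nested arrays
// instead of stdClass objects.
$decoded = json_decode($content, true);

// array_column() (PHP 5.5+) pulls one field out of every row.
$ids = array_column($decoded['quotes']['Data'], 'ID');

print_r($ids);
```

Decoding to arrays is a design choice here: array_column() works directly on the list of rows, so no explicit foreach is needed when only one field matters.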