
I have a file containing the JSON response from a curl request. Each object in it holds a set of values, but the keys for these values have the same names in every object. See the example below.

These objects are part of a larger object called workflow. The "Cleaning up" object is the last operation that runs in our workflow. For every video that passes through the workflow, a JSON file in this format is created. (The real file contains more objects than shown here; this is just for illustration.)

I want to take the value of "completed" from the object with "description": "Cleaning up" and store it in a variable $end_time. Then I want to take the value of "completed" from the object with "description": "Ingest" and store it in a variable $start_time. Subtracting the two gives me an integer time in milliseconds, so I can calculate how long the video took to go through this part of the process. I am fine with the maths part and know how to do it; it is extracting the values that I am struggling with.

I hope this makes sense? ANY help would be appreciated. Thank you in advance!

EDIT: I had to delete the original code from the post because of the character limit.

Here is a proper example of the file that I have to work with:

{
    "workflows": {
        "count": "20", 
        "searchTime": "1", 
        "startPage": "0", 
        "totalCount": "1", 
        "workflow": {
            "configurations": {
                "configuration": [
                    {
                        "$": "1409750880000", 
                        "key": "schedule.start"
                    }, 
                    {
                        "$": "1409755980000", 
                        "key": "schedule.stop"
                    }, 
                    {
                        "$": "Capture_agent", 
                        "key": "schedule.location"
                    }, 
                    {
                        "$": "false", 
                        "key": "trimHold"
                    }, 
                    {
                        "$": "true", 
                        "key": "archiveOp"
                    }, 
                    {
                        "$": "false", 
                        "key": "captionHold"
                    }, 
                    {
                        "$": "false", 
                        "key": "videoPreview"
                    }
                ]
            }, 
            "creator": {
                "organization": "mh_default_org", 
                "roles": [
                    "76b1bdde-a080-40a4-b929-bde89af6a0a8_Instructor", 
                    "ROLE_ADMIN", 
                    "ROLE_ANONYMOUS", 
                    "ROLE_USER"
                ], 
                "userName": "user_name"
            }, 
            "description": "This workflow definition defines the steps involved in scheduling a recording, capturing it, and\n    ingesting it, after which processing operations may be added.\n  ", 
            "errors": "", 
            "id": "15518", 
            "mediapackage": {
                "attachments": "", 
                "creators": {
                    "creator": "Name"
                }, 
                "id": "2d25ed19-2978-458d-a4a0-c9c56d791c68", 
                "license": "Creative Commons 3.0: Attribution-NonCommercial-NoDerivs", 
                "media": "", 
                "metadata": "", 
                "publications": {
                    "publication": {
                        "channel": "engage-player", 
                        "id": "b7b68f91-2c33-4673-ba7c-2e9b891788f9", 
                        "mimetype": "text/html", 
                        "tags": "", 
                        "url": "http://some.url.com:80/engage/ui/watch.html?id=2d25ed19-2978-458d-a4a0-c9c56d791c68"
                    }
                }, 
                "series": "76b1bdde-a080-40a4-b929-bde89af6a0a8", 
                "seriestitle": "Recording_Title_user_name", 
                "start": "2014-09-03T13:28:00Z", 
                "title": "Recording_Title"
            }, 
            "operations": {
                "operation": [
                    {
                        "abortable": "false", 
                        "completed": 1409750882092, 
                        "configurations": {
                            "configuration": [
                                {
                                    "$": "1409750880000", 
                                    "key": "schedule.start"
                                }, 
                                {
                                    "$": "1409755980000", 
                                    "key": "schedule.stop"
                                }, 
                                {
                                    "$": "Capture_agent", 
                                    "key": "schedule.location"
                                }
                            ]
                        }, 
                        "continuable": "false", 
                        "description": "Scheduled", 
                        "execution-history": "", 
                        "execution-host": "http://some.url.com:8080", 
                        "fail-on-error": "true", 
                        "failed-attempts": "0", 
                        "hold-action-title": "View schedule", 
                        "holdurl": "/workflow/hold/org.opencastproject.workflow.handler.scheduleworkflowoperationhandler", 
                        "id": "schedule", 
                        "job": "15519", 
                        "max-attempts": "1", 
                        "retry-strategy": "none", 
                        "started": 1409750881745, 
                        "state": "SUCCEEDED", 
                        "time-in-queue": 0
                    }, 
                    {
                        "abortable": "false", 
                        "configurations": "", 
                        "continuable": "false", 
                        "description": "Capture", 
                        "execution-history": "", 
                        "execution-host": "http://some.url.com:8080", 
                        "fail-on-error": "true", 
                        "failed-attempts": "0", 
                        "hold-action-title": "Monitor capture", 
                        "holdurl": "/workflow/hold/org.opencastproject.workflow.handler.captureworkflowoperationhandler", 
                        "id": "capture", 
                        "job": "42894", 
                        "max-attempts": "1", 
                        "retry-strategy": "none", 
                        "started": 1409750884085, 
                        "state": "SKIPPED", 
                        "time-in-queue": 0
                    }, 
                    {
                        "completed": 1409756171224, 
                        "configurations": "", 
                        "description": "Ingest", 
                        "execution-history": "", 
                        "fail-on-error": "true", 
                        "failed-attempts": "0", 
                        "id": "ingest", 
                        "max-attempts": "1", 
                        "retry-strategy": "none", 
                        "state": "SUCCEEDED"
                    },                     
                    {
                        "completed": 1409854379552, 
                        "configurations": {
                            "configuration": {
                                "key": "preserve-flavors"
                            }
                        }, 
                        "description": "Cleaning up", 
                        "execution-history": "", 
                        "execution-host": "http://some.url.com:8080", 
                        "fail-on-error": "false", 
                        "failed-attempts": "0", 
                        "id": "cleanup", 
                        "job": "45113", 
                        "max-attempts": "1", 
                        "retry-strategy": "none", 
                        "started": 1409854378128, 
                        "state": "SUCCEEDED", 
                        "time-in-queue": 0
                    }
                ]
            }, 
            "organization": {
                "adminRole": "ROLE_ADMIN", 
                "anonymousRole": "ROLE_ANONYMOUS", 
                "id": "mh_default_org", 
                "name": "Opencast Project", 
                "properties": {
                    "property": [
                        {
                            "$": "true", 
                            "key": "adminui.i18n_tab_episode.enable"
                        }, 
                        {
                            "$": "false", 
                            "key": "adminui.i18n_tab_users.enable"
                        }, 
                        {
                            "$": "/engage/ui/img/mh_logos/OpencastLogo.png", 
                            "key": "logo_small"
                        }, 
                        {
                            "$": "http://opencast.org/matterhorn/", 
                            "key": "engageui.link_mobile_redirect.url"
                        }, 
                        {
                            "$": "false", 
                            "key": "engageui.annotations.enable"
                        }, 
                        {
                            "$": "true", 
                            "key": "engageui.links_media_module.enable"
                        }, 
                        {
                            "$": "2024", 
                            "key": "adminui.chunksize"
                        }, 
                        {
                            "$": "false", 
                            "key": "adminui.series_prepopulate.enable"
                        }, 
                        {
                            "$": "true", 
                            "key": "engageui.link_download.enable"
                        }, 
                        {
                            "$": "false", 
                            "key": "engageui.link_mobile_redirect.enable"
                        }, 
                        {
                            "$": "For more information have a look at the official site.", 
                            "key": "engageui.link_mobile_redirect.description"
                        }, 
                        {
                            "$": "/engage/ui/img/mh_logos/MatterhornLogo_large.png", 
                            "key": "logo_large"
                        }
                    ]
                }, 
                "servers": {
                    "server": {
                        "name": "localhost", 
                        "port": "8080"
                    }
                }
            }, 
            "parent": {
                "nil": "true"
            }, 
            "state": "SUCCEEDED", 
            "template": "full", 
            "title": "Scheduled Workflow"
        }
    }
}
  • Try a JSON parser rather than awk or sed. Commented Sep 12, 2014 at 14:17
  • Awk and sed are used for regular expressions, and as you're finding out, you can't parse JSON structures with regular expressions. For more complex structures like XML and JSON, you'll need to use Python or Perl, which have modules that can handle these data structures. Commented Sep 12, 2014 at 14:22
  • Look into jq to help with this. Commented Sep 12, 2014 at 14:33

1 Answer

Here is a jq example that should point you to getting what you want:

#!/bin/bash
# Assuming the json is in a file workflow.json
end_time=$( jq '.workflows.workflow.operations.operation[] | select(.description == "Cleaning up") | .completed' < workflow.json )
start_time=$( jq '.workflows.workflow.operations.operation[] | select(.description == "Ingest") | .completed' < workflow.json )

This assumes the operations are in the array at .workflows.workflow.operations.operation, as in the sample file from the question. Here it is on the command line:

$ jq '.workflows.workflow.operations.operation[] | select(.description == "Ingest") | .completed' < workflow.json
1409756171224
$ jq '.workflows.workflow.operations.operation[] | select(.description == "Cleaning up") | .completed' < workflow.json
1409854379552
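For completeness, the two lookups and the subtraction could be wrapped in a pair of small shell functions. This is only a sketch: it assumes jq is installed and that the file has the layout shown in the question. The function names completed_of and elapsed_ms are made up for illustration.

```shell
#!/bin/bash
# Sketch only: assumes jq is installed and the JSON layout from the question.

# Hypothetical helper: print the "completed" value of the operation in
# file $1 whose description equals $2.
completed_of() {
    jq --arg desc "$2" \
       '.workflows.workflow.operations.operation[]
        | select(.description == $desc)
        | .completed' "$1"
}

# Hypothetical helper: print the milliseconds between the end of "Ingest"
# and the end of "Cleaning up" for the workflow in file $1.
elapsed_ms() {
    local start_time end_time
    start_time=$(completed_of "$1" "Ingest")
    end_time=$(completed_of "$1" "Cleaning up")

    # jq prints nothing when no operation matches and "null" when the
    # operation has no "completed" key (e.g. a SKIPPED step), so check
    # that both values are plain integers before subtracting.
    if [[ $start_time =~ ^[0-9]+$ && $end_time =~ ^[0-9]+$ ]]; then
        echo $(( end_time - start_time ))
    else
        echo "could not read both timestamps from $1" >&2
        return 1
    fi
}
```

After sourcing these functions, it would be called as elapsed_ms workflow.json. For the two timestamps in the question's sample (1409756171224 and 1409854379552), the difference is 98208328 ms. Passing the description with --arg keeps the jq program in single quotes, so no shell quoting is needed for descriptions that contain spaces.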

7 Comments

Thank you very much. I will test this tomorrow and let you know! Thank you very very much!
I am struggling a bit here. I have tried your commands, but I keep getting errors. The command: end_time=$( jq '.workflow[] | select(.description == "Cleaning up") | .completed' < json-result.txt ) The error: jq: error: Cannot iterate over null. I have added the complete JSON file to the original post, as there are not enough characters left in this comment.
I updated the answer based on the new JSON, but I didn't see "Extracting text..." in the example. If you look for this, you will get a blank result. However, the "Cleaning up" example works with the JSON you supplied. The thing to remember is the first part is the array you want to look in, and the original example didn't have the whole path to that. I would try the jq command on the command line first to validate that it gives you what you want.
Thank you! I do apologize for the confusion! Still learning as I go along as to what to post and what not! So sorry again! I will give this a try tomorrow and get back to you! Thank you for your time and effort!
This works brilliantly on this sample file! Thank you! I saw the errors I made, and they are fixed. This file was a sample of a much larger JSON file. Whenever I run this command on the larger file, which contains multiple instances of "workflow" within the parent "workflows", I get the following error: jq: error: Cannot index array with string. A Google search led me to a post which was of no help. I know I somehow have to tweak this command, but I cannot figure out how. Sorry if this sounds stupid.