2

I have a JSON file and I want to extract the website URL (http://mywebsite.com) from it. How do I do that in a shell script?

P.S: I know there are tools like 'jq' which could make it easier but I want to do it using sed/awk/grep utilities.

eg: test.json

{
  "name"       : "xyz", 
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com" 
}

So far I have tried;

cat test.json | grep -i website* | cut -d ':' -f2

Output:

"http

But when I run the command shown above, it also splits on the colon (:) between http and the double slash (//), which I don't want. I want the whole URL to be stored in a variable.

4
  • You don't want to use sed/awk/grep to parse json. You do in fact want to use jq. While you may indeed find people willing to provide answers that help you do this the wrong way, it won't make it any less the wrong way to do this. Commented May 30, 2017 at 2:01
  • Oddly, the question that this is a duplicate of has almost identical JSON. I wonder if they're part of the same course. @skyrocker, can you tell us where this came from? Commented May 30, 2017 at 2:05
  • @ghoti You are correct. I was referring to the same post that you mentioned in your comment above, as it was the first result of my Google search. Commented May 30, 2017 at 2:21
  • @ghoti However, while the JSON may be similar, my question is different from stackoverflow.com/questions/38364261/…. Commented May 30, 2017 at 2:23

3 Answers

3

Well, if you are going to do it wrong (like not using jq), at least do it less wrong:

awk '/website/ {gsub("\"", "", $3); print $3}' test.json

Explanation

awk splits each input line into whitespace-separated fields, so here $3 is the third field (1-based) of every line matching website. The double quotes are then removed (if present) and the result is printed.
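To see the field splitting for yourself, here is a minimal sketch. It assumes test.json has exactly the layout from the question (one key per line, spaces around the colon, no spaces inside the value); the variable name url is only illustrative.

```shell
# Recreate the sample file from the question (assumed layout).
cat > test.json <<'EOF'
{
  "name"       : "xyz",
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com"
}
EOF

# Print every whitespace-separated field awk sees on the matching line.
awk '/website/ { for (i = 1; i <= NF; i++) printf "$%d = %s\n", i, $i }' test.json

# Strip the double quotes from the third field and capture it in a variable.
url=$(awk '/website/ { gsub("\"", "", $3); print $3 }' test.json)
printf '%s\n' "$url"
```

If the file were written as "websiteurl":"http://mywebsite.com" with no spaces, the whole pair would be a single field and $3 would be empty, which is one reason this approach is fragile.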


2 Comments

Can you please explain how $3 works in the above example?
There are three whitespace-separated tokens in your example (the second is the lone :); this takes the third one, and discards double quotes around (and actually also within) it.
0

If jq were an option, the solution would be as simple as:

$ jq .websiteurl < example.json
"http://mywebsite.com"

If jq cannot be made available in your environment, and you want a solution in bash alone, JSON.sh should do the trick:

$ curl -s -O https://raw.githubusercontent.com/dominictarr/JSON.sh/master/JSON.sh
$ declare -A result=()
$ while IFS=$'\t' read -r key value; do eval result$key="$value"; done < <(sh JSON.sh -n < example.json)
$ declare -p result
declare -A result=([websiteurl]="http://mywebsite.com" [name]="xyz" [age]="25" )
$ printf '%s\n' "${result["websiteurl"]}"
http://mywebsite.com

This isn't particularly good, but it worked in the test I just did. The usage above will fail if $value (the data of any part of your JSON input) contains a tab.

JSON.sh should work in any POSIX shell, including bash, and contains no external dependencies.

Also note that declare -A (associative arrays) requires bash version 4 or above.
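If the script might run on an older bash, a version guard is one way to handle that. This is only a sketch: it assumes the script is executed by bash (BASH_VERSINFO is a bash built-in array), and the scalar fallback is a made-up placeholder rather than part of the JSON.sh approach.

```shell
# BASH_VERSINFO[0] holds the major version of the running bash.
if (( BASH_VERSINFO[0] >= 4 )); then
  # Associative arrays are available.
  declare -A result
  result[websiteurl]="http://mywebsite.com"
  url=${result[websiteurl]}
else
  # Hypothetical fallback for bash 3.x: use a plain scalar variable.
  url="http://mywebsite.com"
fi
printf '%s\n' "$url"
```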


0

Why don't you use the quotation mark as the awk field separator?

Example:

cat test.json | grep -i website* | awk -F '"' '{print $4}'

This should work.
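The same idea can be sketched with awk alone, matching the key exactly and storing the result in a variable. It assumes one key per line and no escaped quotes inside the value; json_line and url are illustrative names.

```shell
# One line of the sample file, as in the question.
json_line='  "websiteurl" : "http://mywebsite.com" '

# With '"' as the separator: $1 is the leading spaces, $2 the key,
# $3 the " : " between the quoted strings, and $4 the value.
url=$(printf '%s\n' "$json_line" | awk -F '"' '$2 == "websiteurl" { print $4 }')
printf '%s\n' "$url"
```

Against the file itself, the awk command can read test.json directly instead of taking input from printf.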

3 Comments

Though you probably don't actually want to search for websit, which is what this regex effectively does; and you could inline the grep into the Awk script (though lowercasing in Awk is quite a bit more verbose).
... And of course, the cat is useless.
You may be correct if we want to optimize the one-liner as much as possible, but I think this way is more readable and traceable for beginners. Thanks for the explanation anyway.
