
I need to parse the following JSON output so that I can extract the Title entries:

[{"Title":"000webhost","Name":"000webhost","Domain":"000webhost.com","BreachDate":"2015-03-01","AddedDate":"2015-10-26T23:35:45Z","ModifiedDate":"2015-10-26T23:35:45Z","PwnCount":13545468,"Description":"In approximately March 2015, the free web hosting provider <a href=\"http://www.troyhunt.com/2015/10/breaches-traders-plain-text-passwords.html\" target=\"_blank\" rel=\"noopener\">000webhost suffered a major data breach</a> that exposed over 13 million customer records. The data was sold and traded before 000webhost was alerted in October. The breach included names, email addresses and plain text passwords.","DataClasses":["Email addresses","IP addresses","Names","Passwords"],"IsVerified":true,"IsFabricated":false,"IsSensitive":false,"IsActive":true,"IsRetired":false,"IsSpamList":false,"LogoType":"png"},{"Title":"Lifeboat","Name":"Lifeboat","Domain":"lbsg.net","BreachDate":"2016-01-01","AddedDate":"2016-04-25T21:51:50Z","ModifiedDate":"2016-04-25T21:51:50Z","PwnCount":7089395,"Description":"In January 2016, the Minecraft community known as Lifeboat <a href=\"https://motherboard.vice.com/read/another-day-another-hack-7-million-emails-and-hashed-passwords-for-minecraft\" target=\"_blank\" rel=\"noopener\">was hacked and more than 7 million accounts leaked</a>. Lifeboat knew of the incident for three months before the breach was made public but elected not to advise customers. The leaked data included usernames, email addresses and passwords stored as straight MD5 hashes.","DataClasses":["Email addresses","Passwords","Usernames"],"IsVerified":true,"IsFabricated":false,"IsSensitive":false,"IsActive":true,"IsRetired":false,"IsSpamList":false,"LogoType":"svg"}]

To parse this, I use the following code:

cat $myfile | python -c "import sys, json; print json.load(sys.stdin)[0]['Title']"

But this prints only the first title:

000webhost

whereas I need the output to contain every title:

000webhost

Lifeboat

3
  • Why should the output be that? Commented Dec 6, 2017 at 6:54
  • @JoshHamet They already posted that. Commented Dec 6, 2017 at 6:54
  • Why do you want to do this in a command line rather than using a proper script? Commented Dec 6, 2017 at 7:22

2 Answers

2

If you want to display all the titles you need to loop over the items in the array. Currently you're asking for the first item [0].
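To make the difference concrete, here is a minimal sketch that contrasts indexing the first element with iterating over the whole array. The `raw` string is a trimmed-down copy of the JSON from the question (only three fields kept), and the variable names are illustrative assumptions, not part of the original code.

```python
import json

# Trimmed-down copy of the JSON from the question (assumption: only three
# of the original fields are kept to keep the example short).
raw = (
    '[{"Title": "000webhost", "Name": "000webhost", "Domain": "000webhost.com"},'
    ' {"Title": "Lifeboat", "Name": "Lifeboat", "Domain": "lbsg.net"}]'
)

breaches = json.loads(raw)  # a list of dicts, one per breach

# Indexing with [0] selects only the first dict in the list.
first_title = breaches[0]['Title']

# Iterating visits every dict, so every title is collected.
all_titles = [item['Title'] for item in breaches]

print(first_title)
for title in all_titles:
    print(title)
```

The `print(...)` call form is used here so the sketch runs under both Python 2 and Python 3.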

You can do this as a one-liner using a comprehension to extract the titles:

[item['Title'] for item in json.load(sys.stdin)]

And then a loop to print out each title on a separate line:

for title in [item['Title'] for item in json.load(sys.stdin)]: print title

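So the complete command line script would be as follows. The for statement has to start on its own line, because Python does not allow a compound statement such as for after a semicolon:

cat $myfile | python -c "import sys, json
for title in [item['Title'] for item in json.load(sys.stdin)]: print title"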

1 Comment

@NavanChauhan How can I help you if you don't tell me what the syntax error is?
0

You really should be doing this with a proper script. Also, that's a superfluous use of cat, and you should put Bash parameter expansions inside double-quotes to prevent word-splitting. You can omit the quotes if you're sure the path doesn't contain spaces, but it's really not a good habit to get into.

Anyway, this code works in both Python 2 and Python 3.

python -c "import sys,json;print('\n'.join([u['Title']for u in json.load(open(sys.argv[1]))]))" "$myfile"

output

000webhost
Lifeboat

Here's how to write it as a proper script.

import sys
import json

with open(sys.argv[1]) as f:
    data = json.load(f)
print('\n'.join([u['Title'] for u in data]))

5 Comments

I know I should do it with a proper script, but this is for a Bash program.
@NavanChauhan Fair enough. I recommend you use Shellcheck to make your Bash scripts more robust. And you may find the BashGuide useful.
What should I change so that it prints only the top 3 out of numerous results?
@NavanChauhan A simple way to do that is to slice the list returned by json.load, e.g. print('\n'.join([u['Title']for u in json.load(open(sys.argv[1]))[:3]])).
Thanks a ton mate
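
The slicing suggestion from the comments can be sketched as a small function. This is only an illustration: the names `first_titles`, `limit`, and `sample` are hypothetical and do not appear in the original answer, and the sample data is made up for the demonstration.

```python
import json

def first_titles(json_text, limit=3):
    """Return the 'Title' of at most `limit` entries from a JSON array.

    Hypothetical helper: the name and the default limit are assumptions.
    """
    entries = json.loads(json_text)
    # Slicing past the end of a list is safe in Python: it simply returns
    # fewer items, so short inputs need no special handling.
    return [entry['Title'] for entry in entries[:limit]]

# Made-up sample data with four entries.
sample = json.dumps([{"Title": name} for name in ["A", "B", "C", "D"]])
print('\n'.join(first_titles(sample)))
```

Slicing the parsed list before the comprehension means only the first `limit` entries are visited, although json.loads still parses the whole input.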
