I have a long JSON file containing many repeated entries with a similar structure.

Because the original JSON file is too long, I will just share the link here. It is a result generated from a database called RegulomeDB.

Direct link to the JSON file

I would like to extract specific data (eQTLs) from the "method": "eQTLs" and "value": "xxxx" pairs and put them into 2 tab-delimited columns, exactly as shown below. Note: "value": "xxxx" is extracted right after "method": "eQTLs" is detected.

eQTLs   firstResult, secondResult, thirdResult, ...

In this example, the desired output is:

eQTLs   EIF3S8, EIF3CL

I've tried using a Python script, but it was unsuccessful:

import json
with open('file.json') as f:
    f_json = json.load(f)
    print 'f_json[0]['"method": "eQTLs"'] + "\t" + f_json[0]["value"]

Thank you for your kind help.

  • Do you have a preferred language for doing this? Commented Nov 8, 2022 at 14:36
  • Hi @NickODell, no I don't. But bash would be good. Commented Nov 8, 2022 at 14:38
  • Double request with bioinformatics.stackexchange.com/questions/19978/… Commented Nov 9, 2022 at 10:01

2 Answers

Maybe you'll find the JSON parser xidel useful. It can open URLs and manipulate strings any way you want:

$ xidel -s "https://regulomedb.org/regulome-search/?regions=chr16:28539847-28539848&genome=GRCh37&format=json" \
  -e '"eQTLs	"||join($json("@graph")()[method="eQTLs"]/value,", ")'
eQTLs   EIF3S8, EIF3CL

Or with the XPath/XQuery 3.1 syntax:

-e '"eQTLs	"||join($json?"@graph"?*[method="eQTLs"]?value,", ")'
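For comparison with the Python attempt in the question, the same selection can be sketched with the standard json module. This is only a rough sketch: it assumes, as the xidel query above implies, that the top-level object has an "@graph" array whose items carry "method" and "value" keys. The function names and the non-eQTL sample entry are made up for illustration.

```python
import json


def eqtl_line(data):
    """Return 'eQTLs<TAB>v1, v2, ...' built from parsed RegulomeDB-style JSON.

    Assumes data["@graph"] is a list of objects with "method" and "value"
    keys, mirroring the xidel query above.
    """
    values = [
        str(entry["value"])
        for entry in data.get("@graph", [])
        if isinstance(entry, dict)
        and entry.get("method") == "eQTLs"
        and "value" in entry
    ]
    return "eQTLs\t" + ", ".join(values)


def eqtl_line_from_file(path):
    """Hypothetical helper: load a saved JSON file and build the same line."""
    with open(path) as f:
        return eqtl_line(json.load(f))


# Small inline sample; the "other" entry is invented for illustration only.
sample = {
    "@graph": [
        {"method": "eQTLs", "value": "EIF3S8"},
        {"method": "other", "value": "ignored"},
        {"method": "eQTLs", "value": "EIF3CL"},
    ]
}
print(eqtl_line(sample))  # eQTLs<TAB>EIF3S8, EIF3CL
```

With the JSON saved locally, `print(eqtl_line_from_file("file.json"))` should give the same kind of line, provided the file really has the structure assumed here.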

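Try this pipeline. grep isolates each "method": "eQTLs" object, cut keeps comma-separated fields 1 and 5 (which in this file are the method and the value), and the sed/tr steps strip the JSON punctuation and join the values with ", ". Because it depends on that exact field order, it is more fragile than a real JSON parser: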

cat file.json | grep -iE '"method":\s*"eQTLs"[^}]*' -o | cut -d ',' -f 1,5 | sed -r 's/"|:|method|value//gi' | sed 's/\s*eqtls,\s*//gi' | tr '\n' ',' | sed 's/,$/\n/g' | sed 's/,/, /g' | xargs echo -e 'eQTLs\x09'

6 Comments

Hi @SaSkY, thank you for trying. However, I am getting the following error: grep: invalid option -- P usage: grep [-abcdDEFGHhIiJLlMmnOopqRSsUVvwXxZz] [-A num] [-B num] [-C[num]] [-e pattern] [-f file] [--binary-files=value] [--color=when] [--context[=num]] [--directories=action] [--label] [--line-buffered] [--null] [pattern] [file ...]
@austin7923 I updated the answer; please try the command again.
Thanks! It works! But it would be good if the results in the final output could be joined with commas, exactly like the one shown in the post.
@austin7923 I updated the answer again. Can you try it and tell me if it works as you expected?
Hi @SaSkY, Some edits need to be done on your command. It works flawlessly now. Thank you! cat file.json | grep -iE '"method":\s*"eQTLs"[^}]*' -o | cut -d ',' -f 1,5 | sed -r 's/"|:|method|value//gi' | sed 's/\s*eqtls,\s*//gi' | tr '\n' ',' | sed 's/,$/\n/g' | sed 's/,/, /g' | xargs echo 'eQTLs
|
