I have a JSON dataset of papers and the venues where they were published, like so:

{"id": "1", "title": "Paper1", "venue": {"raw": "Journal of Cell Biology"}}
{"id": "2", "title": "Paper2", "venue": {"raw": "Nature"}}
{"id": "3", "title": "Paper3", "venue": {"raw": "Journal of Histochemistry and Cytochemistry"}}

I want to create nodes for only the papers published in a certain journal, say Nature, and add a relationship between the paper node and an existing journal node. That is, I want to create nodes for only the lines of data with a certain value for the venue.raw key.

The code I am working with is below. I think I need to add some logic to the apoc.load.json part so that it only matches data where $.venue.raw == 'Nature':

CALL apoc.load.json('file:/example.txt', '$.venue.raw') YIELD value AS q 
CREATE (p:Quanta {id:q.id, title:q.title})
WITH q, p
UNWIND q.venue as venue
MATCH (v:Venue {name: venue.raw})
CREATE (p)-[:PUBLISHED_IN_VENUE]->(v)

Is there a way to alter this so that only the relevant data is imported?

Any help would be greatly appreciated!

  • Have you already loaded the Venues into the database? Commented Mar 1, 2019 at 6:58
  • Yes, the venues are already in the database. Commented Mar 1, 2019 at 23:17

1 Answer

I assume the venues are already present in your database. If they are not, load them first, or change the MATCH in the query to CREATE/MERGE so that they are created.

This query filters on the hard-coded venue value. You can later change it to accept the venue as a parameter.

CALL apoc.load.json('file:/example.txt') YIELD value AS q 
WHERE q.venue.raw="Nature"
CREATE (p:Quanta {id:q.id, title:q.title})
WITH p,q
MATCH (v:Venue {name: q.venue.raw})
CREATE (p)-[:PUBLISHED_IN_VENUE]->(v)

2 Comments

This still loads all the data into Cypher instead of loading only the relevant lines, correct? If so, I believe this will be slower than doing the JSON selection inside apoc.load.json itself.
Yes, it will load all the data into memory and create the nodes only if the condition is true. No node is created if the venue is not Nature. Currently it's not possible to filter the data during loading in the way you are asking, so you may need to filter the file outside the database first.
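
As a rough illustration of filtering outside the database, here is a minimal Python sketch that pre-filters the file before it is passed to apoc.load.json. It assumes the file holds one JSON object per line, as in the question. The names filter_by_venue and write_filtered and the file paths in the usage note are hypothetical and not part of APOC or Neo4j.

```python
import json
from typing import Iterable, Iterator


def filter_by_venue(lines: Iterable[str], venue: str) -> Iterator[str]:
    """Yield only the JSON lines whose venue.raw equals `venue`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        venue_field = record.get("venue")
        # Records with a missing or non-object venue are skipped.
        if isinstance(venue_field, dict) and venue_field.get("raw") == venue:
            yield line


def write_filtered(src_path: str, dst_path: str, venue: str) -> int:
    """Copy the matching lines from src_path to dst_path; return how many were written."""
    count = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in filter_by_venue(src, venue):
            dst.write(line + "\n")
            count += 1
    return count
```

For example, write_filtered("example.txt", "nature_only.txt", "Nature") would write a smaller file, and the query in the answer could then point apoc.load.json at that file instead. The WHERE clause becomes redundant at that point but is harmless to keep.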
