I have a CSV file:

Year , To , From , Number 
2005 , A  , G    , 10
2005 , B  , E    , 20
2005 , A  , F    , 30
2006 , C  , D    , 40
2006 , D  , F    , 50

I am expecting to get:

[  
  {  
    'graph':{  
      'links':[  
        {  
          'target':'A',
          'value':10,
          'source':'G'
        },
        {  
          'target':'B',
          'value':20,
          'source':'E'
        },
        {  
          'target':'A',
          'value':30,
          'source':'F'
        }
      ],
      'nodes':[  
        {  
          'name':'A',
          'node':index
        },
        {  
          'name':'B',
          'node':index
        },
        {  
          'name':'E',
          'node':index
        },
        {  
          'name':'F',
          'node':index
        },
        {  
          'name':'G',
          'node':index
        }
      ]
    },
    'year':2005
  },
  {  
    'graph':{  
      'links':[  
        {  
          'target':'C',
          'value':40,
          'source':'D'
        },
        {  
          'target':'D',
          'value':50,
          'source':'F'
        }
      ],
      'nodes':[  
        {  
          'name':'C',
          'node':index
        },
        {  
          'name':'D',
          'node':index
        },
        {  
          'name':'F',
          'node':index
        }
      ]
    },
    'year':2006
  }
]

I tried implementing the code below:

import pandas as pd
df = pd.read_csv('test.csv')
dict = []
link = {
    "source":"",
    "target":"",
    "value":""
}
node1 = {
    "node": "",
    "name": ""
}
node2 = {
    "node": "",
    "name": ""
}
collection = {"year":"", "graph":{
    "nodes":[],
    "links":[]
}}
gd = df.groupby("Year")
for name, group in gd:
    group = group.reset_index(drop=True)
    collection["year"] = group["Year"]
    for i in range(0, len(group)):
        node1["name"] = group["To"][i]
        node2["name"] = group["From"][i]
        link["value"] = group["Number"][i]
        link["source"] = group["From"][i]
        link["target"] = group["To"][i]
        collection["graph"]["nodes"].append(node1)
        collection["graph"]["nodes"].append(node2)
        collection["graph"]["links"].append(link)
    dict.append(collection)
print(dict)

But this is the output of the collection:

{  
  'graph':{  
    'links':[  
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      },
      {  
        'target':'D',
        'value':50,
        'source':'F'
      }
    ],
    'nodes':[  
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      },
      {  
        'node':'',
        'name':'D'
      },
      {  
        'node':'',
        'name':'F'
      }
    ]
  },
  'year':0  2006  1  2006  Name:Year,
  dtype:int64
}        

This is obviously wrong. Why does the output repeat the last row?

1 Answer

This happens because appending link or node to the collection does not copy the dictionary. Python stores only a reference to the same object (see "Add an object to a python list"). Every entry in the list therefore points to the one link/node dictionary, so when a later iteration modifies it, all earlier entries appear to change as well.

The following short snippet demonstrates this behavior:

mylist = []
for i in range(3):
    link["source"] = i
    mylist.append(link)
    print(mylist)

Here link is the same dictionary as in your code. The output is:

[{'source': 0, 'target': '', 'value': ''}]
[{'source': 1, 'target': '', 'value': ''}, {'source': 1, 'target': '', 'value': ''}]
[{'source': 2, 'target': '', 'value': ''}, {'source': 2, 'target': '', 'value': ''}, {'source': 2, 'target': '', 'value': ''}]
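One way to confirm that the list holds three references to a single dictionary, rather than three separate dictionaries, is to compare object identities. The following is a minimal sketch; the name same_object is introduced here only for illustration.

```python
link = {"source": "", "target": "", "value": ""}

mylist = []
for i in range(3):
    link["source"] = i
    mylist.append(link)  # stores a reference to the same dict every time

# Every entry is the very same object as link, not a snapshot of it.
same_object = all(entry is link for entry in mylist)
print(same_object)                           # True
print(len({id(entry) for entry in mylist}))  # 1
```

Because there is only one dictionary, the list always shows whatever was written to it last.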

To avoid this, append a copy instead of the original object, for example a deep copy made with copy.deepcopy from the standard copy module (add import copy at the top of the script). In the example above, only the append line changes:

mylist.append(copy.deepcopy(link))

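After this change, each appended entry keeps the values it had when it was appended. Note that in your code collection is also created once and reused for every year, so it needs the same treatment (or should be created inside the outer loop).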
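An alternative to copying is to build a fresh dictionary in every iteration, so nothing is shared in the first place. The sketch below is one possible way to produce the per-year structure from the question, not the only one. It makes several assumptions: it reads the sample data from an inline string with the standard csv module instead of pandas so that it runs without test.csv, the function name build_graphs is hypothetical, and the 'node' value is assumed to be the position of the name in the sorted list of that year's names.

```python
import csv
import io

# Sample data from the question, inlined so the sketch runs on its own.
CSV_TEXT = """Year , To , From , Number
2005 , A  , G    , 10
2005 , B  , E    , 20
2005 , A  , F    , 30
2006 , C  , D    , 40
2006 , D  , F    , 50
"""


def build_graphs(text):
    """Return one {'year': ..., 'graph': {...}} dict per year."""
    links_by_year = {}  # year -> list of link dicts
    names_by_year = {}  # year -> set of node names seen that year

    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    for row in reader:
        if not row:
            continue
        year, target, source, number = (cell.strip() for cell in row)
        year = int(year)
        # A new dict literal per row: no object is reused between iterations.
        links_by_year.setdefault(year, []).append(
            {"source": source, "target": target, "value": int(number)}
        )
        names_by_year.setdefault(year, set()).update((source, target))

    result = []
    for year in sorted(links_by_year):
        # Assumption: 'node' is the index of the name in sorted order.
        nodes = [
            {"node": i, "name": name}
            for i, name in enumerate(sorted(names_by_year[year]))
        ]
        result.append(
            {"year": year, "graph": {"nodes": nodes, "links": links_by_year[year]}}
        )
    return result


graphs = build_graphs(CSV_TEXT)
print(graphs)
```

Collecting names in a set also removes the duplicate nodes that appear when both endpoints of every row are appended unconditionally.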
