I am new to Python. I am trying to iterate over the rows of individual columns of a pandas DataFrame. I want to create an adjacency list from the first two columns of a DataFrame read from CSV data (which has 3 columns).

The following is the code to iterate over the dataframe and create a dictionary for adjacency list:

import pandas as pd

df1 = pd.read_csv('person_knows_person_0_0_sample.csv', sep=',', index_col=False, skiprows=1)

src_list = list(df1.iloc[:, 0:1])
tgt_list = list(df1.iloc[:, 1:2])
adj_list = {}

for src in src_list:
    for tgt in tgt_list:
        adj_list[src] = tgt

print(src_list)
print(tgt_list)
print(adj_list)

and the following is the output I am getting:

['933']
['4139']
{'933': '4139'}

I see that I am not getting the entire list when I use the list() constructor. Hence I am not able to loop over the entire data.

Could anyone tell me where I am going wrong?

To summarize, here is the input data:

A,B,C
933,4139,20100313073721718
933,6597069777240,20100920094243187
933,10995116284808,20110102064341955
933,32985348833579,20120907011130195
933,32985348838375,20120717080449463
1129,1242,20100202163844119
1129,2199023262543,20100331220757321
1129,6597069771886,20100724111548162
1129,6597069776731,20100804033836982

The output that I am expecting:

933: [4139, 6597069777240, 10995116284808, 32985348833579, 32985348838375]
1129: [1242, 2199023262543, 6597069771886, 6597069776731]
  • Could you please add the input data and expected output to the question as well? There should be a better solution than iterating over all values in a for-loop. Commented Aug 16, 2018 at 7:56

1 Answer

Use groupby to create a Series of lists, then convert it with to_dict:

# selecting by column names
d = df1.groupby('A')['B'].apply(list).to_dict()

# selecting columns by position
d = df1.iloc[:, 1].groupby(df1.iloc[:, 0]).apply(list).to_dict()

print(d)
{933: [4139, 6597069777240, 10995116284808, 32985348833579, 32985348838375],
 1129: [1242, 2199023262543, 6597069771886, 6597069776731]}
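
For anyone who wants to try this without the original file, here is a minimal self-contained sketch. It assumes the sample rows from the question are loaded from an in-memory string (the `csv_text` and `adj` names are only for illustration) and that the header line `A,B,C` is kept, so the `skiprows=1` argument from the question is not used.

```python
import io

import pandas as pd

# Sample rows copied from the question, including the header line A,B,C.
csv_text = """A,B,C
933,4139,20100313073721718
933,6597069777240,20100920094243187
933,10995116284808,20110102064341955
933,32985348833579,20120907011130195
933,32985348838375,20120717080449463
1129,1242,20100202163844119
1129,2199023262543,20100331220757321
1129,6597069771886,20100724111548162
1129,6597069776731,20100804033836982
"""

# Read without skiprows so that A, B, C become the column names.
df1 = pd.read_csv(io.StringIO(csv_text))

# Collect every target in column B under its source in column A.
adj = df1.groupby('A')['B'].apply(list).to_dict()
print(adj)
```

The result maps each value in column A to the list of column B values that share it, which is the adjacency list asked for in the question.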

2 Comments

Yes sure, I have tried this method, it works. But in general, I just want to know if there is a way to loop through columns of dataframes. Why does the list() constructor return only the first value? Is there a way to do the same task by iterating through the rows?
@sindhuja - The first rule in pandas is to avoid loops, because they are slow. And to select the values you need df1.iloc[:, 1] instead of df1.iloc[:, 0:1].
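
To illustrate the point about df1.iloc[:, 1] versus df1.iloc[:, 0:1], here is a small sketch using three of the rows from the question, loaded from an in-memory string (the variable names are only for illustration). A slice such as `0:1` returns a one-column DataFrame, and calling `list()` on a DataFrame yields its column labels, while a single integer position returns a Series whose values `list()` does iterate. The sketch also shows that `skiprows=1` drops the header line, so the first data row is read as the column names, which appears to be why `'933'` showed up in the output. If a loop is still wanted, pairing the two columns with `zip` is one way to do it.

```python
import io
from collections import defaultdict

import pandas as pd

# A subset of the rows from the question, including the header line.
csv_text = """A,B,C
933,4139,20100313073721718
933,6597069777240,20100920094243187
1129,1242,20100202163844119
"""

df = pd.read_csv(io.StringIO(csv_text))

# Slice 0:1 gives a one-column DataFrame; list() returns its column labels.
labels = list(df.iloc[:, 0:1])
print(labels)  # ['A']

# A single position gives a Series; list() returns its values.
values = list(df.iloc[:, 0])
print(values)  # [933, 933, 1129]

# With skiprows=1 the header line is skipped, so the first data row
# becomes the column names (as strings).
skipped = pd.read_csv(io.StringIO(csv_text), skiprows=1)
print(list(skipped.iloc[:, 0:1]))  # ['933']

# Loop-based alternative: walk the two columns in parallel and append
# each target to the list of its source.
adj_list = defaultdict(list)
for src, tgt in zip(df.iloc[:, 0], df.iloc[:, 1]):
    adj_list[src].append(tgt)
print(dict(adj_list))
```

Note that the nested loop in the question overwrites `adj_list[src]` on every pass, so even with the full columns it would keep only one target per source; appending to a list avoids that.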
