I have a spreadsheet in which each row lists the names of the people that a particular person reported working with on each of a number of projects. When I import it into pandas as a dataframe, it looks like this:
                     1                       2
Jane   ['Fred', 'Joe']  ['Joe', 'Fred', 'Bob']
Fred          ['Alex']                ['Jane']
Terry              NaN                 ['Bob']
Bob            ['Joe']       ['Jane', 'Terry']
Alex          ['Fred']                     NaN
Joe           ['Jane']                ['Jane']
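For anyone who wants to reproduce this, the frame above can be rebuilt roughly as follows. This is only a sketch: it assumes the cells are real Python lists with np.nan for the missing entries, and that the two columns are labelled with the integers 1 and 2. The variable name df is just for illustration.

```python
import numpy as np
import pandas as pd

# Rows are the reporting person; each column holds the list of
# co-workers that person named for one project (NaN = nothing reported).
df = pd.DataFrame(
    {
        1: [["Fred", "Joe"], ["Alex"], np.nan, ["Joe"], ["Fred"], ["Jane"]],
        2: [["Joe", "Fred", "Bob"], ["Jane"], ["Bob"],
            ["Jane", "Terry"], np.nan, ["Jane"]],
    },
    index=["Jane", "Fred", "Terry", "Bob", "Alex", "Joe"],
)

print(df)
```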
I want to create an adjacency matrix that will look like this:
       Jane  Fred  Terry  Bob  Alex  Joe
Jane      0     2      0    1     0    2
Fred      1     0      0    0     1    0
Terry     0     0      0    1     0    0
Bob       1     0      1    0     0    1
Alex      0     1      0    0     0    0
Joe       2     0      0    0     0    0
This matrix will generally NOT be symmetric because people's reports are inconsistent. I have been creating the adjacency matrix by looping through the dataframe and incrementing the matrix elements accordingly. Apparently, looping through dataframes is inefficient and NOT recommended, so does anyone have a suggestion on how this could be done more pythonically?
Are the cells really of type list? Can you verify with type(df.iloc[0,0])?
It is list (unless it was np.NaN). BTW, instead of names, I actually have integer ID numbers (the names have been anonymized), so each cell actually contains a list of integers.
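One possible way to avoid the explicit row-by-row loop is to reshape the frame into a long table of (source, target) pairs and let pandas count them. The sketch below is not a definitive solution: it assumes the cells are real Python lists (or NaN), that every person appears in the index, and that pandas is version 1.1 or later (for the ignore_index arguments). The names df, long, adj, "source" and "target" are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed layout).
df = pd.DataFrame(
    {
        1: [["Fred", "Joe"], ["Alex"], np.nan, ["Joe"], ["Fred"], ["Jane"]],
        2: [["Joe", "Fred", "Bob"], ["Jane"], ["Bob"],
            ["Jane", "Terry"], np.nan, ["Jane"]],
    },
    index=["Jane", "Fred", "Terry", "Bob", "Alex", "Joe"],
)

people = df.index  # fixes the row/column order of the final matrix

# Long format: one row per (reporting person, named co-worker) pair.
long = (
    df.melt(ignore_index=False, value_name="target")  # stack the project columns
      .rename_axis("source")
      .reset_index()
      .explode("target", ignore_index=True)           # one list element per row
      .dropna(subset=["target"])                      # drop the NaN cells
)

# Count the pairs, then restore the original ordering; pairs that
# never occur are filled with 0.
adj = (
    pd.crosstab(long["source"], long["target"])
      .reindex(index=people, columns=people, fill_value=0)
)

print(adj)
```

Since this only relies on the cell contents matching the index labels, it should work the same way when the cells hold integer IDs instead of names, as long as the index holds the same integers.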