I have a CSV file with two columns and no header. I want to compare the two columns: for each row, check which values in the column 2 list also appear in column 1, write those matching values to a new CSV file (output.csv), and delete the whole row if its column 2 list has no values matching column 1. For example:
input.csv:
1,"[0, 10, 12, 13, 16, 25, 32, 35, 60, 86, 98, 108, 168, 172, 222, 251, 275, 278, 325, 365]"
60,"[12014, 25665, 28278]"
86,"[0, 6, 7, 10, 12, 25, 76, 156, 174, 176, 181, 188, 365, 392, 438]"
108,"[1, 16, 21, 32, 35, 61, 81, 83, 95, 138, 153, 204, 222]"
438,"[30549]"
28278,"[60, 120, 140, 505, 3939, 4034, 7213, 7308, 8784, 14126, 14147, 15197, 16842, 20022, 28229]"
output.csv:
1,"[60, 108]"
60,"[28278]"
108,"[1]"
28278,"[60]"
I have tried this code:
import csv
with open('input.csv', 'r') as csvfile:
    csvreader = csv.reader(csvfile, delimiter='\t')
    nodes_in_1 = set()
    nodes_in_2 = set()
    for line in csvreader:
        nodes_in_1.add(line[0])
        nodes_in_2.add(line[1])
nodes_in_both = nodes_in_1.intersection(nodes_in_2)
with open('output.csv', 'w') as f_out:
    f_out.write(nodes_in_both + '\n')
I am a beginner. Thank you for the help.
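For reference, here is a minimal sketch of one way the task could be approached. It rests on a few assumptions: the file is comma-delimited (as the sample suggests, rather than tab-delimited), column 2 always holds a Python-style list of integers, and a row that gets dropped no longer counts as a match for other rows. The last assumption is inferred from the example output, where 86 is absent because its only match, 438, is itself removed. The names `filter_rows` and `convert` are hypothetical helpers, not part of any library.

```python
import ast
import csv


def filter_rows(rows):
    """Keep, for each column-1 key, only the listed values that are surviving keys."""
    # Map each column-1 value to its parsed column-2 list (blank rows are skipped).
    table = {int(row[0]): ast.literal_eval(row[1]) for row in rows if row}
    changed = True
    while changed:  # repeat until a full pass changes nothing
        changed = False
        for key in list(table):  # iterate over a copy so deleting is safe
            kept = [v for v in table[key] if v in table]
            if not kept:
                del table[key]  # no matching values left: drop the whole row
                changed = True
            elif len(kept) != len(table[key]):
                table[key] = kept
                changed = True
    return table


def convert(in_path, out_path):
    """Read in_path, filter it, and write the surviving rows to out_path."""
    with open(in_path, newline="") as f_in:
        table = filter_rows(csv.reader(f_in))
    with open(out_path, "w", newline="") as f_out:
        writer = csv.writer(f_out)
        for key, values in table.items():
            # str(values) contains commas, so csv.writer quotes it automatically.
            writer.writerow([key, str(values)])


# Example call (assumes input.csv exists in the working directory):
# convert("input.csv", "output.csv")
```

If dropped rows should still count as matches, a single pass without the `while` loop would be enough, but the result would then include 86 and so differ from the example output.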