How to resolve a KeyError in pandas using Python?

Question

I have three CSV files. The first (csv1) can be considered a positive dataset in which the first column (column 1) consists of certain IDs, and the same goes for the second column. The data in csv1 are paired, meaning the two entries in each row form a pair. Ex:

colA    colB
 A.1     B.1
 C.1     D.1

Here, A.1 and B.1 can be considered a pair, and the same goes for C.1 and D.1. The second file (csv2) contains data only for the entries in Column A of file 1. Ex:

Col   X1    X2    X3    X4 
A.1  0.1   0.2   0.3   0.4
C.1  0.2   0.3   0.4   0.5

Similarly, the third file (csv3) contains data for the entries in Column B of file 1. Ex:

Col   X1    X2    X3    X4 
B.1  0.1   0.2   0.3   0.4
D.1  0.2   0.3   0.4   0.5

I am writing code that first imports all three files, then iterates over the length of Column A of file 1 and, on each iteration, assigns the values in the current row of Column A and Column B to x and y respectively. After assigning x and y, I want to check whether x is in file 2 and y is in file 3. If both are present, I want to extract the corresponding rows, concatenate them, and save the result in a separate CSV.

So, if "x" is assigned the value A.1 (that is, the string A.1) and "y" is assigned the value B.1, I want my code to first check whether A.1 is in file2 and B.1 is in file3. If both are there, I want to extract the corresponding row values for A.1 (0.1,0.2,0.3,0.4...) and B.1 (0.2,0.3,0.4,0.5...) and concatenate their values:

  col     x1    x2    x3    x4   x5   x6
A.1_B.1   0.1   0.2   0.3   0.4  0.2  0.3

This is what I have written, but I am getting a "KeyError", even though the ID is present when I check my CSV file. Any help would be much appreciated.

file1 = pd.read_csv("/home/file1.csv")
file2 = pd.read_csv("/home/file2.csv")
file3 = pd.read_csv("/home/file3.csv")

for i in range(len(file1['ID'])):
    x = ID_A[i]
    y = ID_B[i]
    if x in CT_ID_A:
        if y in CT_ID_B:
            d1 = file2.loc[x]
            d2 = file3.loc[y]
            d3 = pd.concat([d1,d2],axis=1)

Here, ID_A and ID_B consist of the corresponding IDs from the columns of file1, and CT_ID_A and CT_ID_B consist of the IDs of file2 and file3, that is:

ID_A = ['A.1','C.1']
ID_B = ['B.1','D.1']
CT_ID_A = ['A.1','C.1']
CT_ID_B = ['B.1','C.1']

Shisui Otsutsuki · Accepted Answer · 2022-08-23 11:26:27Z

1

If the KeyError is raised for 'ID', then it is possible that your CSV file header does not have any column named ID.
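
One way to check this is to print the header names that pandas actually read. The sketch below is only an illustration: it uses an in-memory stand-in for file1.csv and assumes the header is colA,colB as in the question's example table, so the real file may differ.

```python
import io

import pandas as pd

# Stand-in for /home/file1.csv (assumption: header names taken from the
# example table in the question, comma-separated).
csv1 = io.StringIO("colA,colB\nA.1,B.1\nC.1,D.1\n")
file1 = pd.read_csv(csv1)

# Inspect the column names pandas actually parsed from the header.
print(list(file1.columns))

# Test for the column before indexing with it.
has_id = "ID" in file1.columns

# Indexing with a column name that is not in the header raises KeyError.
try:
    file1["ID"]
    raised = False
except KeyError:
    raised = True
```

If has_id is False, replacing 'ID' with one of the printed names (or renaming the column) should remove this particular KeyError.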

1 Comment

eseuteo Over a year ago

This answer is right. Also you are using ID_A, ID_B, CT_ID_A and CT_ID_B when you haven't declared them.
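
A minimal sketch of the whole task, following the comment above by deriving the ID values from the DataFrames instead of using undeclared lists. It rests on several assumptions that are not confirmed by the question: the files are comma-separated, file1 has the headers colA and colB, and the ID column of file2 and file3 is named Col. That column is loaded as the index because .loc looks rows up by index label, so with the default integer index a lookup such as file2.loc['A.1'] would itself raise a KeyError. The output column names x1..x8 are also just illustrative.

```python
import io

import pandas as pd

# In-memory stand-ins for the three files (assumed comma-separated, with the
# headers from the question's example tables).
file1 = pd.read_csv(io.StringIO("colA,colB\nA.1,B.1\nC.1,D.1\n"))
file2 = pd.read_csv(
    io.StringIO("Col,X1,X2,X3,X4\nA.1,0.1,0.2,0.3,0.4\nC.1,0.2,0.3,0.4,0.5\n"),
    index_col="Col",  # make the IDs the index so .loc can find them by label
)
file3 = pd.read_csv(
    io.StringIO("Col,X1,X2,X3,X4\nB.1,0.1,0.2,0.3,0.4\nD.1,0.2,0.3,0.4,0.5\n"),
    index_col="Col",
)

rows = {}
# Walk through the pairs row by row.
for x, y in zip(file1["colA"], file1["colB"]):
    # Keep the pair only if both IDs exist in their respective files.
    if x in file2.index and y in file3.index:
        rows[f"{x}_{y}"] = list(file2.loc[x]) + list(file3.loc[y])

# One output row per pair; columns are numbered x1, x2, ... (hypothetical names).
n_values = len(file2.columns) + len(file3.columns)
result = pd.DataFrame.from_dict(
    rows, orient="index", columns=[f"x{i}" for i in range(1, n_values + 1)]
)
result.index.name = "col"

# to_csv() without a path returns the CSV text; pass a file path to save it.
csv_text = result.to_csv()
```

Building the combined rows in a dictionary and creating the DataFrame once avoids calling pd.concat inside the loop, and it gives each pair a single flat row, which is the shape shown in the question's expected output.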
