KeyError: "None of [['', '']] are in the [columns]" pandas python

Question

I would like to slice two columns in my data frame.

This is my code for doing this:

import pandas as pd
df = pd.read_csv('source.txt',header=0)
cidf = df.loc[:,['vocab','sumCI']]

This is a sample of data:

ID  vocab   sumCI   sumnextCI   new_diff
450      statu    3.0        0.0       3.0
391     provid    4.0        1.0       3.0
382  prescript    3.0        0.0       3.0
300   lymphoma    2.0        0.0       2.0
405      renew    2.0        0.0       2.0

Running it gives this error:

KeyError: "None of [['', '']] are in the [columns]"

What I have tried:

I tried passing header=0 when reading the file.

I tried to rename columns with this code:

df.rename(columns=df.iloc[0], inplace=True)

I also tried this:

df.columns = df.iloc[1]
df = df.reindex(df.index.drop(0))

I also tried the suggestions in the comments in this link.

None of the above resolved the issue.

rafaelc · Accepted Answer · 2018-08-23 01:08:27Z

11

Judging by the sample you posted, your file is delimited by whitespace. pd.read_csv uses a comma (,) as its default separator, so you have to state the delimiter explicitly:

df = pd.read_csv('source.txt', header=0, delim_whitespace=True)
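To see why the separator matters, here is a minimal sketch that parses the question's sample rows twice. It is only an illustration: io.StringIO stands in for 'source.txt', and the variable names are made up. Since pandas 2.2, delim_whitespace is deprecated in favour of sep=r"\s+", so the sketch uses the latter.

```python
import io
import pandas as pd

# Sample rows copied from the question; StringIO stands in for 'source.txt'.
raw = """ID  vocab   sumCI   sumnextCI   new_diff
450      statu    3.0        0.0       3.0
391     provid    4.0        1.0       3.0
382  prescript    3.0        0.0       3.0
"""

# Default comma separator: the whole header line becomes a single column name,
# so neither 'vocab' nor 'sumCI' exists as a column.
bad = pd.read_csv(io.StringIO(raw), header=0)
print(list(bad.columns))

# Treat runs of whitespace as the delimiter instead.
good = pd.read_csv(io.StringIO(raw), header=0, sep=r"\s+")
print(list(good.columns))

# The original slice now works.
cidf = good.loc[:, ["vocab", "sumCI"]]
print(cidf)
```

If the first print shows one long column name, the delimiter is the likely cause of the KeyError.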

answered Aug 23, 2018 at 1:08

rafaelc


Mansour Torabi · Accepted Answer · 2021-11-04 21:05:28Z

3

Maybe you have whitespace around your column names. Double-check your CSV file.
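One way to check for this, sketched under assumptions: the CSV text below is a made-up example with padded header cells, not the asker's file. Printing each label with repr() makes the stray spaces visible, and rename(columns=str.strip) is one possible cleanup.

```python
import io
import pandas as pd

# Hypothetical CSV whose header cells carry stray spaces.
raw = "ID, vocab ,sumCI \n450,statu,3.0\n"
df = pd.read_csv(io.StringIO(raw))

# repr() makes leading/trailing spaces in the labels visible.
print([repr(c) for c in df.columns])

# Selecting by the clean names fails because the real labels are padded.
try:
    df.loc[:, ["vocab", "sumCI"]]
    lookup_failed = False
except KeyError as err:
    lookup_failed = True
    print("KeyError:", err)

# Apply str.strip to every column label, then select again.
cleaned = df.rename(columns=str.strip)
print(cleaned.loc[:, ["vocab", "sumCI"]])
```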

answered Nov 4, 2021 at 21:05

Mansour Torabi


1 Comment

nicomp Over a year ago

This is also something to be aware of when using read_sql_query(): a column name in a table could have leading or trailing spaces. Ugh.

cottontail · Accepted Answer · 2023-02-03 09:54:50Z

3

If you get this (or a similar) error, check whether your dataframe actually contains these columns. The following must return True for the indexing to work.

cols = ['vocab', 'sumCI']
set(df.columns).issuperset(cols)

If the above returns False, then you'll need to process the columns.

A common culprit is leading/trailing space, so try

df.columns = df.columns.str.strip()

Other common problems are double underscores, double spaces, or an em dash (—) between words in otherwise legitimate column names. In that case you can use a regex to collapse surplus spaces and underscores and to replace em dashes with hyphens in the column names:

df.columns = df.columns.to_series().replace({r'\s+': ' ', r'_+': '_', r'—': '-'}, regex=True)
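As a rough sketch of the check-then-clean workflow, the snippet below lists which requested labels are missing before and after normalising the names. The frame and the "wanted" list are hypothetical, and it uses the Index.str accessor rather than the to_series().replace form above; either approach should do for simple cases.

```python
import pandas as pd

# Hypothetical frame whose labels have a leading space and a double underscore.
df = pd.DataFrame(
    [[450, "statu", 3.0]],
    columns=["ID", " vocab", "sum__CI"],
)

wanted = ["vocab", "sum_CI"]

# Which requested labels are absent? An empty list means indexing will work.
missing_before = [c for c in wanted if c not in df.columns]
print(missing_before)

# Strip outer whitespace, then collapse repeated underscores into one.
df.columns = df.columns.str.strip().str.replace(r"_+", "_", regex=True)

missing_after = [c for c in wanted if c not in df.columns]
print(missing_after)
```

Listing the missing names tends to be more informative than a bare True/False, since it shows exactly which label needs attention.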

answered Feb 3, 2023 at 9:54

cottontail


adiga · Accepted Answer · 2019-06-25 08:03:55Z

2

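Simply write code that creates a new CSV file with explicit headers, then use the new file:

import pandas as pd
# Read the whitespace-delimited source and keep the result
df = pd.read_csv('source.txt', header=0, delim_whitespace=True)
# Overwrite the column labels with clean names
headers = ['ID','vocab','sumCI','sumnextCI','new_diff']
df.columns = headers
# Write a comma-separated copy; index=False keeps the row index out of the file
df.to_csv('newsource.txt', index=False)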
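A variation on the same idea, offered as a sketch rather than as this answer's method: read_csv can take the clean labels directly through names= (with header=0 so the file's own header row is discarded). The example uses in-memory buffers in place of 'source.txt' and 'newsource.txt', which is an assumption made to keep it self-contained.

```python
import io
import pandas as pd

# Two sample rows from the question; StringIO stands in for 'source.txt'.
raw = """ID  vocab   sumCI   sumnextCI   new_diff
450      statu    3.0        0.0       3.0
391     provid    4.0        1.0       3.0
"""
headers = ["ID", "vocab", "sumCI", "sumnextCI", "new_diff"]

# header=0 discards the file's own header row; names= supplies the labels.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=0, names=headers)

# Write comma-separated text without the index, then read it back with defaults.
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)
roundtrip = pd.read_csv(buf)
print(roundtrip.loc[:, ["vocab", "sumCI"]])
```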

edited Jun 25, 2019 at 8:03

adiga


answered Jun 25, 2019 at 7:52

Sejpalsinh Jadeja


Deepstop · Accepted Answer · 2019-09-17 20:03:03Z

2

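You can try this:

df = pd.read_csv('source.txt', header=0, delim_whitespace=True)

If the fields in the file are separated by whitespace rather than commas, the default settings read each line as a single column, and selecting columns by name then raises an error. delim_whitespace=True tells read_csv to treat runs of whitespace as the delimiter.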

edited Sep 17, 2019 at 20:03

Deepstop


answered Sep 17, 2019 at 19:06

kaushik Tummalapali
