32

I'm trying to create an SVM model from code I found on GitHub here, but it keeps raising this error.

Traceback (most recent call last):
  File "C:\Users\Me\Documents\#e\projects\Sign-Language-Glove-master\modeling.py", line 22, in <module>
    train_features = train[['F1','F2','F3','F4','F5','X','Y','Z','C1','C2']]
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2934, in __getitem__
    raise_missing=True)
  File "C:\Python27\lib\site-packages\pandas\core\indexing.py", line 1354, in _convert_to_indexer
    return self._get_listlike_indexer(obj, axis, **kwargs)[1]
  File "C:\Python27\lib\site-packages\pandas\core\indexing.py", line 1161, in _get_listlike_indexer
    raise_missing=raise_missing)
  File "C:\Python27\lib\site-packages\pandas\core\indexing.py", line 1246, in _validate_read_indexer
    key=key, axis=self.obj._get_axis_name(axis)))
KeyError: u"None of [Index([u'F1', u'F2', u'F3', u'F4', u'F5', u'X', u'Y', u'Z', u'C1', u'C2'], dtype='object')] are in the [columns]"

This is my code.

import pandas as pd
dataframe= pd.read_csv("lettera.csv", delimiter=',')
df=pd.DataFrame(dataframe)

from sklearn.model_selection import train_test_split
train, test = train_test_split(df, test_size = 0.2)

train_features = train[['F1','F2','F3','F4','F5','X','Y','Z','C1','C2']]

And these are the contents of the csv file.

LABEL, F1, F2, F3, F4, F5, X, Y, Z, C1, C2

1, 631, 761, 739, 751, 743, 14120, -5320, 7404, 0, 0

1, 632, 759, 740, 751, 744, 14108, -5276, 7444, 0, 0

1, 630, 761, 740, 752, 743, 14228, -5104, 7680, 0, 0

1, 630, 761, 738, 750, 743, 14256, -5148, 7672, 0, 0

1, 632, 759, 740, 751, 744, 14172, -5256, 7376, 0, 0

1, 632, 759, 742, 751, 746, 14288, -5512, 7412, 0, 0

1, 632, 759, 742, 751, 744, 14188, -5200, 7416, 0, 0

1, 634, 759, 738, 751, 743, 14252, -5096, 7524, 0, 0

1, 630, 759, 739, 751, 743, 14364, -5124, 7612, 0, 0

1, 630, 759, 740, 751, 744, 14192, -5316, 7424, 0, 0

1, 631, 760, 739, 752, 743, 14292, -5100, 7404, 0, 0

1, 634, 759, 738, 751, 742, 14232, -5188, 7468, 0, 0

1, 632, 759, 740, 751, 744, 14288, -5416, 7552, 0, 0

1, 630, 760, 739, 752, 743, 14344, -5072, 7816, 0, 0

1, 631, 760, 739, 752, 743, 14320, -4992, 7444, 0, 0

1, 630, 762, 739, 751, 746, 14220, -5172, 7544, 0, 0

1, 630, 759, 739, 751, 742, 14280, -5176, 7416, 0, 0

1, 630, 760, 738, 752, 740, 14360, -5028, 7468, 0, 0

1, 632, 759, 738, 752, 741, 14384, -5108, 7364, 0, 0

1, 629, 757, 737, 751, 741, 14224, -5108, 7536, 0, 0

1, 629, 758, 740, 751, 744, 14412, -5136, 7956, 0, 0

1, 629, 761, 740, 750, 744, 14468, -4868, 7100, 0, 0

1, 629, 760, 738, 752, 741, 14504, -4964, 6600, 0, 0

1, 629, 758, 738, 749, 741, 14440, -5112, 6828, 0, 0

1, 629, 760, 738, 752, 741, 14484, -5016, 7556, 0, 0
1
  • Also try to load your csv as a tab-separated dataframe: dataframe= pd.read_csv("lettera.csv", sep='\t') Commented May 5, 2020 at 16:57

6 Answers

35

The problem is that there are spaces in your column names; here is what I get when I save your data and load the dataframe as you have done:

df.columns
# result:
Index(['LABEL', ' F1', ' F2', ' F3', ' F4', ' F5', ' X', ' Y', ' Z', ' C1',
       ' C2'],
      dtype='object')

so, putting back these spaces in the column names eliminates the error:

train_features = train[[' F1',' F2',' F3',' F4',' F5',' X',' Y',' Z',' C1',' C2']] # works OK
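To confirm this kind of mismatch quickly, a small check like the following may help. It is only a sketch: the two data rows are copied from the question, the file is simulated in memory with io.StringIO, and the names wanted, missing, and padded are illustrative.

```python
import io

import pandas as pd

# Header and two rows copied from the question, held in memory for the demo.
csv_text = (
    "LABEL, F1, F2, F3, F4, F5, X, Y, Z, C1, C2\n"
    "1, 631, 761, 739, 751, 743, 14120, -5320, 7404, 0, 0\n"
    "1, 632, 759, 740, 751, 744, 14108, -5276, 7444, 0, 0\n"
)
df = pd.read_csv(io.StringIO(csv_text), delimiter=",")

wanted = ["F1", "F2", "F3", "F4", "F5", "X", "Y", "Z", "C1", "C2"]

# repr() puts quotes around each name, so a leading space becomes visible.
print([repr(c) for c in df.columns])

# Requested names that are not actual column labels.
missing = [c for c in wanted if c not in df.columns]

# Actual labels that carry leading or trailing whitespace.
padded = [c for c in df.columns if c != c.strip()]

print("missing:", missing)
print("padded:", padded)
```

If missing is non-empty and padded contains the same names with extra whitespace, the header spacing is the likely culprit.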

But arguably, having spaces in your column names is not good practice (you saw what can happen!), so it is better to eliminate them during loading. Here is the end-to-end code to do that (it also drops the unnecessary second dataframe):

import pandas as pd
df= pd.read_csv("lettera.csv", delimiter=',', header=None, skiprows=1, names=['LABEL','F1','F2','F3','F4','F5','X','Y','Z','C1','C2'])

from sklearn.model_selection import train_test_split
train, test = train_test_split(df, test_size = 0.2)
train_features = train[['F1','F2','F3','F4','F5','X','Y','Z','C1','C2']] # works OK
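As a possible alternative to listing the names by hand, read_csv has a skipinitialspace option that skips spaces following each delimiter, which applies to the header row too. The sketch below assumes the file looks like the sample in the question and simulates it in memory with io.StringIO instead of reading lettera.csv.

```python
import io

import pandas as pd

# Header and two rows copied from the question, held in memory for the demo.
csv_text = (
    "LABEL, F1, F2, F3, F4, F5, X, Y, Z, C1, C2\n"
    "1, 631, 761, 739, 751, 743, 14120, -5320, 7404, 0, 0\n"
    "1, 632, 759, 740, 751, 744, 14108, -5276, 7444, 0, 0\n"
)

# skipinitialspace=True drops the space that follows each comma,
# so the header names come out without a leading blank.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

feature_names = ["F1", "F2", "F3", "F4", "F5", "X", "Y", "Z", "C1", "C2"]
features = df[feature_names]  # no KeyError now
print(list(df.columns))
```

This keeps the header in the file as the single source of the column names, at the cost of relying on the spacing being consistent.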

5

In my case, it was because the column names in my dataframe contained spaces. I fixed it by removing the spaces from the column names.

# remove spaces from the column names
df.columns = df.columns.str.replace(' ', '')
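The choice between removing, trimming, and replacing spaces matters when a name has a space in the middle. Here is a minimal sketch of the three options; the column " hand pose" is a made-up example, not part of the question's data.

```python
import pandas as pd

# " hand pose" is a hypothetical column name with an inner space.
df = pd.DataFrame([[1, 2, 3]], columns=["LABEL", " F1", " hand pose"])

# Trim only leading and trailing whitespace.
stripped = list(df.columns.str.strip())

# Remove every space, including inner ones.
removed = list(df.columns.str.replace(" ", "", regex=False))

# Trim first, then turn the remaining inner spaces into underscores.
underscored = list(df.columns.str.strip().str.replace(" ", "_", regex=False))

print(stripped)
print(removed)
print(underscored)
```

Trimming alone is enough for the question's file, since its names only have a leading space.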

1

I had the same error when trying to create new columns on a dataframe from the results of apply():

>>> df[["foo","bar"]] = df.apply(lambda r: ["foobar","baz"], axis=1)
"None of [Index(['foo', 'bar'], dtype='object')] are in the [columns]"

The solution was simply to pass the result_type="expand" argument to apply():

df[["foo","bar"]] = df.apply(lambda r: ["foobar","baz"], axis=1, result_type="expand")
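For a self-contained illustration, the sketch below runs the same call on a tiny made-up dataframe (the column "a" is an assumption for the demo). It also shows what apply() returns without result_type, which, as far as I know, is a Series of lists in recent pandas versions.

```python
import pandas as pd

# Tiny made-up frame; "a" is only there so apply() has rows to visit.
df = pd.DataFrame({"a": [1, 2, 3]})

# Without result_type, each row's list is kept as one object in a Series.
plain = df.apply(lambda r: ["foobar", "baz"], axis=1)

# With result_type="expand", each list element becomes its own column.
expanded = df.apply(lambda r: ["foobar", "baz"], axis=1, result_type="expand")

# A two-column result can be assigned to two new column names.
df[["foo", "bar"]] = expanded

print(type(plain).__name__, type(expanded).__name__)
print(df)
```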

I found this solution on this answer, which deserves upvotes.

0

Try with

import pandas as pd
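# a separator longer than one character is handled by the Python parsing engine
dataframe = pd.read_csv("lettera.csv", sep=r', ', engine='python')

I had the same problem with other separators, and I saw this one (", ", a comma followed by a space) in your data.

If you can see which separator your dataset uses, pass it with sep=. Note that delimiter is an alias for sep, so only one of the two should be given.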
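A slightly more forgiving variant is a regular-expression separator that accepts a comma followed by any amount of whitespace. This is a sketch under the assumption that the file matches the sample in the question; the data is simulated in memory with io.StringIO.

```python
import io

import pandas as pd

# Header and two rows copied from the question, held in memory for the demo.
csv_text = (
    "LABEL, F1, F2, F3, F4, F5, X, Y, Z, C1, C2\n"
    "1, 631, 761, 739, 751, 743, 14120, -5320, 7404, 0, 0\n"
    "1, 632, 759, 740, 751, 744, 14108, -5276, 7444, 0, 0\n"
)

# r",\s*" matches a comma plus optional whitespace; regex separators
# require engine="python".
df = pd.read_csv(io.StringIO(csv_text), sep=r",\s*", engine="python")

print(list(df.columns))
```

Regex separators are slower than the default C parser, which may matter for large files.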

0

In my case, changing sep=';' to sep=',' in read_csv fixed the issue.
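A wrong separator produces the same KeyError because the whole header line ends up as a single column name. The sketch below shows this on a made-up three-column CSV (the names a, b, c are illustrative).

```python
import io

import pandas as pd

# Made-up comma-separated data for the demo.
csv_text = "a,b,c\n1,2,3\n"

# Wrong separator: nothing is split, so there is one column named "a,b,c".
wrong = pd.read_csv(io.StringIO(csv_text), sep=";")

# Matching separator: three columns as expected.
right = pd.read_csv(io.StringIO(csv_text), sep=",")

print(list(wrong.columns), wrong.shape)
print(list(right.columns), right.shape)
```

Checking df.shape or df.columns right after loading is a cheap way to catch this.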

-1

Try adding the index_col=0 parameter when reading the dataset.

1 Comment

As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.