I am trying to create a multiclass text classifier as explained here. However, my code breaks at this line:

NB_pipeline.fit(X_train, train[category])

Below is the error I am getting:

File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)

I tried to inspect what train[category] returns on its own, and I got the same error.

1) X_train is a dataframe with one column, which contains customer feedback.

2) train is a dataframe with two columns: the first column contains the customer review (same as X_train) and the second column contains one of the 5 categories (Systems Error, Proactive Communication, Staff Behaviour, Website Functionalities, Others).

3) category is one of the above-mentioned categories.

Below is the sample train dataframe:

Index           Feedback                                    Category
  0           While making payment got system error.         System error
              Staff behaviour was good at hotel

  1           While making payment got system error.         Staff Behaviour
              Staff behaviour was good at hotel
  • Can you include more of the stack trace and the minimal code in the question, please? It should be self-contained and reproducible without following a separate website. Strip anything not relevant to the question, as long as it stays reproducible. Commented Oct 8, 2018 at 9:15

1 Answer

This is one of the most commonly overlooked issues.

The reason for this error is that the column the script is looking for is not present in the dataframe. train[category] looks up a column named after the category, but in your dataframe the category names are values in the Category column, not column names. Each of your 5 categories should be its own column in the input dataframe, and each row should hold 1 or 0 depending on whether that category applies to the feedback/comment. Ideally, your input dataframe should look like this:

Index           Feedback                                  System error    Staff Behaviour
  0           While making payment got system error.         1                  1
              Staff behaviour was good at hotel

  1           While making payment got system error.         1                  0

  2           Staff behaviour was good at hotel              0                  1

I have used the same comments to show what the input dataframe should look like.
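As a minimal sketch of how to get from the two-column layout in the question to the layout above, assuming the column names Feedback and Category from the sample and using made-up rows, pandas' get_dummies can build one 0/1 column per category. The name train_wide is hypothetical, and this is only one possible way to reshape the data, not necessarily what the linked tutorial does:

```python
import pandas as pd

# Hypothetical sample rows shaped like the question's train dataframe.
train = pd.DataFrame({
    "Feedback": [
        "While making payment got system error.",
        "Staff behaviour was good at hotel",
    ],
    "Category": ["System error", "Staff Behaviour"],
})

# Build one indicator column per distinct category value (1 if it applies, else 0).
indicators = pd.get_dummies(train["Category"]).astype(int)

# Keep the feedback text and attach the indicator columns next to it.
train_wide = pd.concat([train[["Feedback"]], indicators], axis=1)

categories = list(indicators.columns)
for category in categories:
    # Each category name is now a real column, so this lookup succeeds
    # instead of failing inside the pandas hash table.
    y = train_wide[category]
    print(category, y.tolist())
```

With the data in this shape, train_wide[category] can be passed as the target in the fit call from the question. If the same feedback text appears in several rows with different categories, the rows would additionally need to be merged per feedback so that one row can carry more than one 1.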

1 Comment

user7467529: Thank you for posting your answer. A well-deserved reputation loss for me. :P
