I'm a bit lost with the use of feature hashing in Python with pandas.

I have a DataFrame with multiple columns holding different types of information. One column represents the class of each row.

Example:

         col1   col2    colType
    1     1      2        'A'
    2     1      1        'B'
    3     2      4        'C' 

My goal is to apply feature hashing to colType so that I can feed the data to a machine learning algorithm.

I have created a separate DataFrame for the colType, having something like this:

                   colType  value
           1         'A'       1
           2         'B'       2
           3         'C'       3
           4         'D'       4

Then I applied feature hashing to this class DataFrame. But I don't understand how to add the result of the feature hashing back to my original DataFrame so that I can use it as input to a machine learning algorithm.

This is how I use FeatureHasher:

  from sklearn.feature_extraction import FeatureHasher
  fh = FeatureHasher(n_features=10, input_type='string')
  result = fh.fit_transform(categoriesDF)

How do I insert the FeatureHasher result into my DataFrame? How bad is my approach? Is there a better way to achieve what I am doing?

Thanks!

2 Answers

3

I know this answer comes in late, but I stumbled upon the same problem and found this works:

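import pandas as pd
from sklearn.feature_extraction import FeatureHasher

fh = FeatureHasher(n_features=8, input_type='string')
# Each sample must be an iterable of strings; a bare string would be hashed character by character (or rejected by recent scikit-learn versions)
sp = fh.fit_transform([[v] for v in df['colType']])
hashed = pd.DataFrame(sp.toarray(), columns=[f'fh{i}' for i in range(1, 9)], index=df.index)
df = pd.concat([df, hashed], axis=1)  # keep the result; concat does not modify df in place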

This creates a DataFrame out of the sparse matrix returned by the FeatureHasher and concatenates it column-wise with the existing DataFrame.
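For anyone who wants to try this end to end, here is a minimal sketch on a toy frame that mirrors the question's example. The frame `data`, the variable names, and `n_features = 8` are my own assumptions for illustration, not something taken from the asker's real data. Each category is wrapped in a list so that the hasher treats it as a single token.

```python
import pandas as pd
from sklearn.feature_extraction import FeatureHasher

# Toy frame mirroring the question's example (hypothetical data)
data = pd.DataFrame({
    "col1": [1, 1, 2],
    "col2": [2, 1, 4],
    "colType": ["A", "B", "C"],
})

n_features = 8  # illustrative size; pick a larger value for many categories
hasher = FeatureHasher(n_features=n_features, input_type="string")

# One single-token sample per row
hashed = hasher.fit_transform([[value] for value in data["colType"]])

# Densify the sparse result and align it with the original row index
hashed_df = pd.DataFrame(
    hashed.toarray(),
    columns=[f"fh{i}" for i in range(1, n_features + 1)],
    index=data.index,
)

# Replace the raw category column with its hashed representation
features = pd.concat([data.drop(columns="colType"), hashed_df], axis=1)
print(features)
```

Passing `index=data.index` matters when the original frame does not have a default 0..n-1 index; otherwise `pd.concat` aligns on mismatched labels and fills the gaps with NaN. Keep in mind that with a small `n_features` two different categories can hash to the same column.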

0

I have switched to one-hot encoding, using something like this:

categoriesDF = pd.get_dummies(categoriesDF)

This function creates one indicator column for every distinct category value, holding 1 or 0 (True or False by default in recent pandas versions).
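As a rough sketch of how this looks when applied directly to the full frame rather than to a separate categories frame: the toy `data` below is assumed from the question's example, and `dtype=int` is my own choice to force 1/0 output regardless of pandas version.

```python
import pandas as pd

# Toy frame mirroring the question's example (hypothetical data)
data = pd.DataFrame({
    "col1": [1, 1, 2],
    "col2": [2, 1, 4],
    "colType": ["A", "B", "C"],
})

# Encode only colType; the numeric columns pass through unchanged
encoded = pd.get_dummies(data, columns=["colType"], dtype=int)
print(encoded)
```

The original `colType` column is replaced by `colType_A`, `colType_B`, and `colType_C`, so the result can be used as model input without a separate concatenation step.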
