I'm a bit lost with the use of Feature Hashing in Python Pandas .
I have the a DataFrame with multiple columns, with many information in different types. There is one column that represent a class for the data.
Example:
col1 col2 colType
1 1 2 'A'
2 1 1 'B'
3 2 4 'C'
My goal is to apply FeatureHashing for the ColType, in order to be able to apply a Machine Learning Algorithm.
I have created a separate DataFrame for the colType, having something like this:
colType value
1 'A' 1
2 'B' 2
3 'C' 3
4 'D' 4
Then, applied Feature Hashing for this class Data Frame. But I don't understand how to add the result of Feature Hashing to my DataFrame with the info, in order to use it as an input in a Machine Learning Algorithm.
This is how I use FeatureHashing:
from sklearn.feature_extraction import FeatureHasher
fh = FeatureHasher(n_features=10, input_type='string')
result = fh.fit_transform(categoriesDF)
How do I insert this FeatureHasher result, to my DataFrame? How bad is my approach? Is there any better way to achieve what I am doing?
Thanks!