I'm using grid search to fit machine learning model parameters.
I ran the following code (modified from the sklearn documentation page: http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html):
from sklearn import svm, grid_search, datasets, cross_validation
# getting data
iris = datasets.load_iris()
# grid of parameters
parameters = {'kernel':('linear', 'poly'), 'C':[1, 10]}
# predictive model (support vector machine)
svr = svm.SVC()
# cross validation procedure
mycv = cross_validation.StratifiedKFold(iris.target, n_folds = 2)
# grid search engine
clf = grid_search.GridSearchCV(svr, parameters, cv=mycv)
# fitting engine
clf.fit(iris.data, iris.target)
However, when I look at clf.estimator, I get the following:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
kernel='rbf', max_iter=-1, probability=False, random_state=None,
shrinking=True, tol=0.001, verbose=False)
How did I end up with an 'rbf' kernel? I didn't specify it as an option in my parameters.
What's going on?
Thanks!
P.S. I'm using version '0.15-git' of sklearn.
Addendum: I noticed that clf.best_estimator_ gives the right output. So what is clf.estimator doing?
The kernel key should have a list as its value, i.e. ['linear', 'poly'] (square brackets). 'rbf' just showed up because it is the default. estimator is an object of the GridSearchCV class. If you create an instance of this class, i.e. clf, .estimator will return the object and in this case, since your initial code was erroneous, it returned the default.
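To make the difference between the two attributes concrete, here is a minimal sketch. It assumes a recent scikit-learn release, where sklearn.grid_search and sklearn.cross_validation have been replaced by sklearn.model_selection, so the imports and the n_splits argument differ from the 0.15-era code in the question. The variable name svc is only illustrative. The point it tries to show is that estimator holds the unfitted object passed to the constructor, which still carries the default kernel, while best_estimator_ is the refitted model with the winning parameters.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Load the data and define the grid (kernel values given as a list).
iris = datasets.load_iris()
parameters = {"kernel": ["linear", "poly"], "C": [1, 10]}

# Base model: never fitted directly, so it keeps its constructor
# defaults, including kernel='rbf'.
svc = svm.SVC()

# Assumption: modern API, where the fold count is passed as n_splits and
# the labels are supplied later through fit().
mycv = StratifiedKFold(n_splits=2)

# Pass the cross-validation object by keyword.
clf = GridSearchCV(svc, parameters, cv=mycv)
clf.fit(iris.data, iris.target)

# The template that was passed in: still the default kernel.
print(clf.estimator.kernel)

# The parameters selected by the search, and the model refit with them.
print(clf.best_params_)
print(clf.best_estimator_.kernel)
```

In this sketch clf.estimator is the very same object as svc, so inspecting it says nothing about the search result; best_params_ and best_estimator_ are the attributes to look at after fitting.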