I'm developing a model to predict the target variable using the RandomForestRegressor from scikit-learn.
I have written a function to get the MSE, as below:
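from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

def get_mse(n_estimators, max_leaf_nodes, X_train, X_valid, y_train, y_valid):
    model = RandomForestRegressor(n_estimators=n_estimators, max_leaf_nodes=max_leaf_nodes, random_state=0)
    model.fit(X_train, y_train)
    preds_val = model.predict(X_valid)
    # squared=False returns the root mean squared error (RMSE), not the MSE
    mse = mean_squared_error(y_valid, preds_val, squared=False)
    return mse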
I would like to use a for loop to find the best MSE score by combining a list of values for n_estimators and max_leaf_nodes.
Below is the code that I wrote:
n_estimators = [100, 150, 200, 250]
max_leaf_nodes = [10, 50, 100, 200]
for n_estimators, max_leaf_nodes in zip(n_estimators, max_leaf_nodes):
    my_mse = get_mse(n_estimators, max_leaf_nodes, X_train, X_valid, y_train, y_valid)
    print("N_estimators: %d \t\t Max leaf nodes: %d \t\t Mean Squared Error: %d" %(n_estimators, max_leaf_nodes, my_mse))
But when I run this for loop, it always prints an MSE of 0 for each combination of the two hyperparameters.
I have tested my function with the following call, and it returns the correct MSE:
get_mse(200, 100, X_train, X_valid, y_train, y_valid)
I'm wondering why my for loop is not working properly and always gives me an MSE of 0.
Could someone help me solve this issue?
Thank you
Have you tried replacing %d with %f for mse in the format string? If the mean squared error is a float between 0 and 1, using %d will always print zero.
Also, see GridSearch (scikit-learn.org/stable/modules/generated/…) or something similar.
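The formatting point raised in the comment can be checked without any model. The sketch below is only an illustration under stated assumptions: fake_get_mse is a hypothetical stand-in for get_mse (it just returns a float below 1 so the example runs without data), and the names n_estimators_grid and max_leaf_nodes_grid are made up here to avoid reusing the list names as loop variables. It also uses itertools.product, which may be closer to "combining" the two lists than zip, since zip only pairs the values position by position.

```python
from itertools import product

n_estimators_grid = [100, 150, 200, 250]
max_leaf_nodes_grid = [10, 50, 100, 200]

def fake_get_mse(n_estimators, max_leaf_nodes):
    # Hypothetical stand-in for get_mse so the sketch runs without data;
    # it simply returns a float below 1.
    return 1.0 / (n_estimators + max_leaf_nodes)

# %d converts a float to an integer by truncation, so an error below 1 prints as 0.
truncated = "%d" % 0.4321    # "0"
as_float = "%.4f" % 0.4321   # "0.4321"

# product() yields all 16 combinations; zip() would yield only 4 pairs.
results = []
for n_est, leaves in product(n_estimators_grid, max_leaf_nodes_grid):
    score = fake_get_mse(n_est, leaves)
    results.append((n_est, leaves, score))
    print("N_estimators: %d \t Max leaf nodes: %d \t Error: %.4f" % (n_est, leaves, score))

# Keep the combination with the lowest error.
best = min(results, key=lambda r: r[2])
```

With the real get_mse in place of the stand-in, the same loop would presumably print the non-zero errors, since only the format specifier for the last value changes.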