Predict using randomForest package in R

Question

How can I use the result of a randomForest call in R to predict labels for unlabeled data (e.g. real-world input to be classified)?
Code:

train_data = read.csv("train.csv")
input_data = read.csv("input.csv")
result_forest = randomForest(label ~ ., data=train_data)
labeled_input = result_forest.predict(input_data) # I need something like this

train.csv:

a;b;c;label;
1;1;1;a;
2;2;2;b;
1;2;1;c;

input.csv:

a;b;c;
1;1;1;
2;1;2;

I need to get something like this:

a;b;c;label;
1;1;1;a;
2;1;2;b;

@eipi10, thanks a lot. That's my first day with R. You can rewrite your comment as an answer so I can accept it.
– xander27, Commented Sep 3, 2016 at 16:42
There are lots of questions on Stack Overflow related to predict, so I'd guess this question is probably a duplicate. No need for me to add an answer. The key thing to remember for future reference is that just about every modeling function in R has a predict "method": if you run predict on the model object, it returns predictions for the training data by default, or predictions for new data if you use the newdata argument.
– eipi10, Commented Sep 3, 2016 at 16:48
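
The point in the comment above can be illustrated with a minimal sketch using base R only. It uses lm and the built-in cars data set as a stand-in (neither appears in the question), and the names fit, train_pred, new_cars and new_pred are made up for illustration. Note that for a randomForest object specifically, calling predict without newdata returns out-of-bag predictions rather than plain training-set predictions.

```r
# Fit a simple linear model on the built-in cars data set (stand-in example).
fit <- lm(dist ~ speed, data = cars)

# Without newdata, predict() returns predictions for the training rows.
train_pred <- predict(fit)

# With newdata, predict() returns predictions for the rows supplied.
# new_cars is a hypothetical data frame with the same predictor column.
new_cars <- data.frame(speed = c(10, 20))
new_pred <- predict(fit, newdata = new_cars)

length(train_pred)  # one prediction per training row
length(new_pred)    # one prediction per new row
```

The same calling pattern, predict(model, newdata = ...), is what the answers below use with a random forest.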

Easthaven · Accepted Answer · 2016-09-03 19:50:03Z

2

Let me know if this is what you are getting at.

You train your random forest with your training data:

library(randomForest)

# Training dataset
train_data <- read.csv("train.csv")
# Train the random forest
forest_model <- randomForest(label ~ ., data = train_data)

Now that the random forest is trained, you can give it new data so that it predicts the labels.

input_data$predictedlabel <- predict(forest_model, newdata=input_data)

The above code adds a new column to your input_data showing the predicted label.
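
For anyone who wants to try this without the CSV files, here is a self-contained sketch of the same steps. The data frames are hypothetical stand-ins for train.csv and input.csv, built inline with the training rows repeated so the forest has more than three observations. Two assumptions are worth checking against the real files: read.csv expects commas by default, so semicolon-separated files like those in the question would need sep = ";", and randomForest performs classification only when the response is a factor, which since R 4.0 means converting the label column explicitly.

```r
library(randomForest)

set.seed(42)  # make the random forest reproducible

# Hypothetical stand-in for train.csv: three numeric features and a label.
train_data <- data.frame(
  a = rep(c(1, 2, 1), times = 10),
  b = rep(c(1, 2, 2), times = 10),
  c = rep(c(1, 2, 1), times = 10),
  label = factor(rep(c("a", "b", "c"), times = 10))  # factor => classification
)

# Hypothetical stand-in for input.csv: same feature columns, no label.
input_data <- data.frame(a = c(1, 2), b = c(1, 1), c = c(1, 2))

# Train on the labeled data.
forest_model <- randomForest(label ~ ., data = train_data)

# Predict a label for each unlabeled row and store it as a new column.
input_data$predictedlabel <- predict(forest_model, newdata = input_data)
print(input_data)
```

The result is the input data with one extra column, which can then be written out with write.csv if a file is needed.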


manaR · Accepted Answer · 2016-09-03 19:51:15Z

0

You can use the predict function, for example:

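library(randomForest)

data(iris)
set.seed(111)
# Randomly assign each row to group 1 (about 80%, training) or group 2 (about 20%, test)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
# Fit on the training rows, then predict the held-out rows
iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ])
iris.pred <- predict(iris.rf, iris[ind == 2, ])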

This is from http://ugrad.stat.ubc.ca/R/library/randomForest/html/predict.randomForest.html
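
As a possible follow-up, the sketch below repeats the example and then compares the predictions with the known species of the held-out rows. It also requests class probabilities with type = "prob", which predict supports for classification forests. The names test_rows, confusion and iris.prob are made up for illustration and are not part of the documentation example.

```r
library(randomForest)

data(iris)
set.seed(111)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
iris.rf <- randomForest(Species ~ ., data = iris[ind == 1, ])

# Held-out rows (hypothetical helper name).
test_rows <- iris[ind == 2, ]

# Predicted class for each held-out row.
iris.pred <- predict(iris.rf, test_rows)

# Cross-tabulate observed against predicted species.
confusion <- table(observed = test_rows$Species, predicted = iris.pred)
print(confusion)

# Per-class vote fractions instead of a single predicted class.
iris.prob <- predict(iris.rf, test_rows, type = "prob")
head(iris.prob)
```

With real unlabeled input there is nothing to compare against, so only the predict calls apply; the confusion table is useful when some labeled data is held back for checking.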
