# Evaluating Random Forest Models with Out-of-Bag Errors
Evaluate a random forest model without using cross-validation, by relying on out-of-bag samples:
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn import datasets

# Load the iris data
iris = datasets.load_iris()
features = iris.data
target = iris.target

# Create a random forest classifier that tracks out-of-bag samples
randomforest = RandomForestClassifier(
    random_state=0, n_estimators=1000, oob_score=True, n_jobs=-1)

# Train the model
model = randomforest.fit(features, target)

# View the out-of-bag accuracy score
randomforest.oob_score_
```

```
0.9533333333333334
```
## Discussion
In random forests, each decision tree is trained on a bootstrapped subset of the observations. This means that for every tree there is a separate subset of observations that was not used to train that tree. These are called out-of-bag (OOB) observations, and we can use them as a test set to evaluate the performance of our random forest.
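A quick way to see why OOB observations exist: when we draw a bootstrap sample of size n with replacement, each observation has probability (1 − 1/n)^n ≈ e⁻¹ ≈ 36.8% of never being drawn. The following sketch (not from the original recipe; it simulates a single bootstrap draw with NumPy) illustrates this:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10000

# Draw a bootstrap sample: n indices sampled with replacement from [0, n)
sample = rng.integers(0, n, size=n)

# Observations never drawn are "out-of-bag" for this hypothetical tree
oob_fraction = 1 - len(np.unique(sample)) / n

# On average, roughly 1 - 1/e ≈ 36.8% of observations are out-of-bag
print(round(oob_fraction, 3))
```

So each tree in the forest leaves roughly a third of the data untouched, which is what makes the OOB set large enough to serve as a built-in validation set.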
For every observation, the learning algorithm compares the observation's true value with the prediction from the subset of trees that were not trained on that observation. These comparisons are aggregated into a single score that measures the random forest's performance. OOB score estimation is an alternative to cross-validation.
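To check that the OOB score really does behave like a cross-validation estimate, we can compute both on the same data and compare. This sketch assumes the iris data from the solution above and uses scikit-learn's `cross_val_score`; the exact numbers will vary slightly with `random_state` and the number of trees:

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()
features = iris.data
target = iris.target

# Forest with OOB scoring enabled
forest = RandomForestClassifier(
    n_estimators=100, oob_score=True, random_state=0, n_jobs=-1)
forest.fit(features, target)

# Equivalent forest evaluated with 5-fold cross-validation
cv_scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0, n_jobs=-1),
    features, target, cv=5)

# The two estimates should be close to each other
print(forest.oob_score_)
print(cv_scores.mean())
```

The advantage of the OOB estimate is that it comes for free from a single model fit, whereas 5-fold cross-validation here requires training five additional forests.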