Computing per-class recall and precision in Python: Python code for recall

Reposted

智能领航员 2024-08-13 08:18:09


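The metrics module of scikit-learn (sklearn.metrics) offers many options for evaluating models. This article covers the ones for classification algorithms: which metrics are available, what each one means mathematically, and how to use it to evaluate a model.

Before computing anything, import the required libraries and fit a model to evaluate.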

# Import the required libraries
from sklearn.metrics import accuracy_score
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score  # used in section 3
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from sklearn.metrics import cohen_kappa_score
from sklearn.metrics import f1_score
from sklearn.metrics import fbeta_score  # used in section 4
from sklearn.ensemble import RandomForestClassifier
from sklearn import datasets
from sklearn.model_selection import train_test_split
# Build a model: random forest on the iris data set
iris = datasets.load_iris()
x_train, x_test, y_train, y_test = train_test_split(iris['data'], iris['target'], random_state=41)
forest_clf = RandomForestClassifier(random_state=41)
forest_clf.fit(x_train, y_train)
y_pred = forest_clf.predict(x_test)
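1. accuracy_score and precision_score

accuracy_score computes the accuracy: the number of correctly classified samples divided by the total number of samples, whether the problem has two classes or more.

precision_score computes the precision. People sometimes loosely call this accuracy as well, but it is a different metric, because it distinguishes positive from negative examples (the positive class is usually the one we care about). With TP the number of true positives and FP the number of false positives, its precise definition is:

$$\text{precision} = \frac{TP}{TP + FP}$$

(that is, the proportion of samples predicted as positive that really are positive)

That covers the two-class case. Multi-class problems can be scored in the same way, which brings in macro and micro averaging. The macro average computes the metric for each class separately and then takes the arithmetic mean over the classes. The micro average pools every instance in the data set, regardless of class, into one global confusion matrix and computes the metric from that. For more on these averages, see the post "谈谈评价指标中的宏平均和微平均" by robert_ai.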

print('Accuracy:', accuracy_score(y_test, y_pred))
print('Macro-averaged precision:', precision_score(y_test, y_pred, average='macro'))
print('Micro-averaged precision:', precision_score(y_test, y_pred, average='micro'))
# Output
# Accuracy: 0.921052631579
# Macro-averaged precision: 0.944444444444
# Micro-averaged precision: 0.921052631579
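
To make the difference between the two averages concrete, here is a minimal hand-rolled sketch in plain Python. The labels are toy data invented for illustration (they are not the iris predictions), and precision_by_class is a hypothetical helper, not a scikit-learn function.

```python
from collections import Counter

# Toy labels, invented for illustration (not the iris predictions).
y_true = [0, 0, 1, 1, 1, 2, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 2, 2, 2, 2, 1, 1]


def precision_by_class(y_true, y_pred):
    """Return {class: TP / (TP + FP)} computed directly from the labels."""
    true_pos = Counter()   # correct predictions, keyed by predicted class
    predicted = Counter()  # all predictions (TP + FP), keyed by predicted class
    for t, p in zip(y_true, y_pred):
        predicted[p] += 1
        if t == p:
            true_pos[p] += 1
    classes = sorted(set(y_true) | set(y_pred))
    # Assumption: a class that is never predicted gets precision 0.0.
    return {c: true_pos[c] / predicted[c] if predicted[c] else 0.0
            for c in classes}


per_class = precision_by_class(y_true, y_pred)

# Macro: average the per-class precisions, each class weighted equally.
macro_precision = sum(per_class.values()) / len(per_class)

# Micro: pool all predictions first, then compute a single precision.
micro_precision = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_pred)

print(per_class)
print(macro_precision, micro_precision)
```

When every sample has exactly one label, pooling all predictions makes the micro-averaged precision equal to the accuracy, which is why the micro-averaged value reported above matches the accuracy.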

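2. confusion_matrix

confusion_matrix computes the confusion matrix, which shows more directly how many samples of each class were misclassified and which classes they were assigned to. Rows are true classes and columns are predicted classes. The labels parameter selects which classes appear in the returned matrix and in what order. The confusion matrix for the predictions above is: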

print('Confusion matrix:\n', confusion_matrix(y_test, y_pred, labels=[0, 1, 2]))
# Output
# Confusion matrix:
# [[10  0  0]
#  [ 0 15  0]
#  [ 0  3 10]]
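
Per-class precision and recall can be read straight off this matrix. The following is a minimal sketch in plain Python that hard-codes the matrix printed above; the variable names are illustrative, and it assumes every class is predicted at least once (otherwise the precision division would fail).

```python
# Confusion matrix reported above: rows are true classes, columns are predictions.
cm = [
    [10, 0, 0],
    [0, 15, 0],
    [0, 3, 10],
]

n = len(cm)
row_sums = [sum(cm[i]) for i in range(n)]                      # samples per true class
col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # samples per predicted class

# The diagonal entry cm[k][k] is the number of true positives for class k.
precision = [cm[k][k] / col_sums[k] for k in range(n)]  # TP / (TP + FP)
recall = [cm[k][k] / row_sums[k] for k in range(n)]     # TP / (TP + FN)

for k in range(n):
    print(f'class {k}: precision={precision[k]:.2f} recall={recall[k]:.2f}')
```

scikit-learn returns the same per-class values as an array when average=None is passed to precision_score or recall_score.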

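3. recall_score

Recall is another commonly used evaluation metric. With TP the number of true positives and FN the number of false negatives, it is defined as:

$$\text{recall} = \frac{TP}{TP + FN}$$

(that is, the proportion of truly positive samples that end up predicted as positive)

Macro and micro averaging apply to recall in the same way.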

print('Macro-averaged recall:', recall_score(y_test, y_pred, average='macro'))
print('Micro-averaged recall:', recall_score(y_test, y_pred, average='micro'))
# Output
# Macro-averaged recall: 0.923076923077
# Micro-averaged recall: 0.921052631579

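4. f1_score and fbeta_score

The F1 score combines precision and recall and is often used as a criterion for model selection. It is defined as:

$$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$

(that is, the harmonic mean of precision and recall)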

print('Macro-averaged F1:', f1_score(y_test, y_pred, average='macro'))
print('Micro-averaged F1:', f1_score(y_test, y_pred, average='micro'))
# Output
# Macro-averaged F1: 0.926218708827
# Micro-averaged F1: 0.921052631579

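F_beta is the general form of the F1 score:

$$F_\beta = \frac{(1 + \beta^2) \cdot \text{precision} \cdot \text{recall}}{\beta^2 \cdot \text{precision} + \text{recall}}$$

With beta = 1 it reduces to F1. A beta below 1 weights precision more heavily, and a beta above 1 weights recall more heavily.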

print('Macro-averaged F_beta:', fbeta_score(y_test, y_pred, beta=0.5, average='macro'))
print('Micro-averaged F_beta:', fbeta_score(y_test, y_pred, beta=0.5, average='micro'))
# Output
# Macro-averaged F_beta: 0.935155063977
# Micro-averaged F_beta: 0.921052631579
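
As a cross-check on the macro-averaged values, the sketch below applies the F_beta formula to each class and averages the results. It is a hand-rolled illustration, not scikit-learn's implementation: fbeta is a hypothetical helper, and the per-class precision and recall pairs are derived from the confusion matrix in section 2.

```python
def fbeta(precision, recall, beta=1.0):
    """F_beta for one precision/recall pair; returns 0.0 when both are zero."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


# (precision, recall) per class, derived from the confusion matrix in section 2:
# class 0: 10/10 and 10/10, class 1: 15/18 and 15/15, class 2: 10/10 and 10/13.
pairs = [(1.0, 1.0), (15 / 18, 1.0), (1.0, 10 / 13)]

# Macro average: compute the score per class, then take the plain mean.
macro_f1 = sum(fbeta(p, r) for p, r in pairs) / len(pairs)
macro_f05 = sum(fbeta(p, r, beta=0.5) for p, r in pairs) / len(pairs)

print(macro_f1)   # compare with the macro-averaged F1 reported above
print(macro_f05)  # compare with the macro-averaged F_beta (beta=0.5) above
```

With beta=0.5 the score for class 1, whose precision is the weaker of its two values, drops further than its F1 score does, which shows how a small beta penalizes low precision.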

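5. Classification report

The classification report summarizes the metrics above in one table: per-class precision, recall, F1 score, and support (the number of samples in each class).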

print('Classification report:\n', classification_report(y_test, y_pred))
# Output
# Classification report:
#              precision    recall  f1-score   support
#
#           0       1.00      1.00      1.00        10
#           1       0.83      1.00      0.91        15
#           2       1.00      0.77      0.87        13
#
# avg / total       0.93      0.92      0.92        38

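6. cohen_kappa_score

The Cohen's kappa score is a number between -1 and 1. A score above 0.8 is generally taken to mean a good classification; a score of 0 or lower means a poor one (effectively random labels).

print('Cohen kappa score:', cohen_kappa_score(y_test, y_pred))
# Output
# Cohen kappa score: 0.879237288136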
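
The article does not spell out the formula, so as a hedged illustration: the standard definition is kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (the accuracy) and p_e is the agreement expected by chance from the class frequencies. The sketch below applies it to the confusion matrix from section 2 in plain Python; the variable names are illustrative.

```python
# Confusion matrix from section 2: rows are true classes, columns are predictions.
cm = [
    [10, 0, 0],
    [0, 15, 0],
    [0, 3, 10],
]

n = len(cm)
total = sum(sum(row) for row in cm)
row_sums = [sum(cm[i]) for i in range(n)]
col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]

# Observed agreement: fraction of samples on the diagonal (the accuracy).
p_o = sum(cm[k][k] for k in range(n)) / total

# Chance agreement: probability that true and predicted labels coincide
# if both were drawn independently from their marginal frequencies.
p_e = sum(row_sums[k] * col_sums[k] for k in range(n)) / total ** 2

kappa = (p_o - p_e) / (1 - p_e)
print(kappa)  # compare with the cohen_kappa_score result above
```

Because p_e is subtracted from both the numerator and the denominator, kappa is lower than the raw accuracy whenever some agreement would be expected by chance alone.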
