No.  Description  Code  Notes
1. Get a feature column or the label column of a data set

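 Feature columns (assuming each row of dataSet is a list whose last element is the label):
numFeatures = len(dataSet[0]) - 1   # assumption: every column except the last is a feature
for i in range(numFeatures):
    featList = [example[i] for example in dataSet]   # all values of feature i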
 
 Label column:
classList = [example[-1] for example in dataSet]   # last element of each row
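
To make the two comprehensions concrete, here is a small sketch on a made-up data set. The rows in dataSet are hypothetical, and the layout (features first, label last) is an assumption carried over from the snippets above.

```python
# Hypothetical toy data set: two feature columns followed by a label column.
dataSet = [
    [1, 1, "yes"],
    [1, 0, "no"],
    [0, 1, "no"],
]

numFeatures = len(dataSet[0]) - 1  # assumes the label is the last column

# One list per feature column.
featureColumns = [[example[i] for example in dataSet] for i in range(numFeatures)]
# The label column.
classList = [example[-1] for example in dataSet]

print(featureColumns)  # [[1, 1, 0], [1, 0, 1]]
print(classList)       # ['yes', 'no', 'no']
```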
 
 
 2. Count how many times each value occurs in a column of the data set
 Method 1:
classCount = {}
for vote in classList:
    if vote not in classCount:   # first time this value is seen
        classCount[vote] = 0
    classCount[vote] += 1

 

Method 2:

classCount = {}
for vote in classList:
    classCount[vote] = classCount.get(vote, 0) + 1   # get() returns 0 for an unseen value
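
As a possible third option, the standard library's collections.Counter does the same counting in one call. This is only a sketch; the contents of classList here are made up for illustration.

```python
from collections import Counter

classList = ["yes", "no", "no", "yes", "no"]  # hypothetical label column

classCount = Counter(classList)  # dict subclass mapping each value to its count

print(classCount["no"])     # 3
print(classCount["maybe"])  # 0 (a missing key counts as zero, no KeyError)
```

Counter behaves like the plain dict built by Methods 1 and 2, so code that reads classCount[vote] should keep working.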

 

 
 
 3. Count how many times each element occurs in a list, sort by count, and return the most frequent element

import operator

def majority(items):
    counts = {}
    for key in items:
        counts[key] = counts.get(key, 0) + 1
    # sort (element, count) pairs by count, largest first
    sortedResult = sorted(counts.items(), key=operator.itemgetter(1), reverse=True)
    return sortedResult[0][0]
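
A short usage sketch for the function above, next to what should be an equivalent one-liner built on collections.Counter.most_common. The votes list is made up, and the function is repeated here only so the snippet runs on its own.

```python
import operator
from collections import Counter

def majority(items):
    counts = {}
    for key in items:
        counts[key] = counts.get(key, 0) + 1
    # sort (element, count) pairs by count, largest first
    sortedResult = sorted(counts.items(), key=operator.itemgetter(1), reverse=True)
    return sortedResult[0][0]

votes = ["no", "yes", "no", "yes", "no"]  # hypothetical input

print(majority(votes))                       # no
print(Counter(votes).most_common(1)[0][0])   # no
```

Both versions raise an error on an empty list, so callers may want to check for that case first.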

 

 
 4.  Get the distinct values in a list    unique_values = set(values)    (the result is a set, so it is unordered)
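
A minimal sketch of the set approach, plus one way to keep the first-seen order when that matters. The input list and the variable names are made up for illustration.

```python
values = ["b", "a", "b", "c", "a"]  # hypothetical input

unique_values = set(values)                    # distinct values, no guaranteed order
ordered_unique = list(dict.fromkeys(values))   # distinct values in first-seen order

print(sorted(unique_values))  # ['a', 'b', 'c']
print(ordered_unique)         # ['b', 'a', 'c']
```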