老师我不会:) 2021-04-07 11:43 Acceptance rate: 66.7%

Read the code below and explain what it does.

from sklearn.datasets import load_digits
from sklearn.tree import DecisionTreeClassifier
from sklearn import metrics
import numpy as np
import time

start = time.time()

# For an introduction to the dataset loaded by load_digits, see https://blog.csdn.net/Asun0204/article/details/75607948
digits = load_digits()

train_size = 1500
train_x, train_y = digits.data[:train_size], digits.target[:train_size]
test_x, test_y = digits.data[train_size:], digits.target[train_size:]

# --- SECTION 2 ---
# Create our bootstrap samples and train the classifiers

ensemble_size = 10
base_learners = []

for _ in range(ensemble_size):
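    # Draw a random bootstrap sample (with replacement) from the training set to train each base classifier; ensemble_size base classifiers are trained in total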
    bootstrap_sample_indices = np.random.randint(0, train_size, size=train_size)
    bootstrap_x = train_x[bootstrap_sample_indices]
    bootstrap_y = train_y[bootstrap_sample_indices]
    dtree = DecisionTreeClassifier()
    dtree.fit(bootstrap_x, bootstrap_y)
    base_learners.append(dtree)

# --- SECTION 3 ---
# Evaluate the accuracy of each base classifier on the test set
base_predictions = []
base_accuracy = []
for learner in base_learners:
    predictions = learner.predict(test_x)
    base_predictions.append(predictions)
    acc = metrics.accuracy_score(test_y, predictions)
    base_accuracy.append(acc)

# --- SECTION 4 ---
# Combine the base learners' predictions by majority vote

ensemble_predictions = []
# Find the most voted class for each test instance
for i in range(len(test_y)):
    # Count the votes for each class
    counts = [0 for _ in range(10)]
    for learner_predictions in base_predictions:
        counts[learner_predictions[i]] = counts[learner_predictions[i]]+1



    final = np.argmax(counts)

    ensemble_predictions.append(final)

ensemble_acc = metrics.accuracy_score(test_y, ensemble_predictions)

end = time.time()


# --- SECTION 5 ---
# Print the results
print('Base Learners:')
print('-'*30)
for index, acc in enumerate(sorted(base_accuracy)):
    print(f'Learner {index+1}: {acc:.2f}')
print('-'*30)
print('Bagging: %.2f' % ensemble_acc)

print('Total time: %.2f' % (end - start))
Explain what this part of the code does:
for i in range(len(test_y)):
    counts = [0 for _ in range(10)]
    for learner_predictions in base_predictions:
        counts[learner_predictions[i]] = counts[learner_predictions[i]]+1
    final = np.argmax(counts)
    ensemble_predictions.append(final)

1 answer

  • kaili_ya 2021-04-20 09:58
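    for i in range(len(test_y)):
        # Count the votes for each class
        counts = [0 for _ in range(10)]   # 10 classes; counts[c] is the number of base classifiers that predicted class c
        for learner_predictions in base_predictions:  # each base classifier's prediction adds 1 to the count of the class it predicted
            counts[learner_predictions[i]] = counts[learner_predictions[i]]+1
        final = np.argmax(counts)   # pick the class predicted by the most base classifiers as the final class (majority vote)
        ensemble_predictions.append(final)   # store the final prediction for this sample
    ensemble_acc = metrics.accuracy_score(test_y, ensemble_predictions)   # compute the accuracy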

    Put simply, this is majority voting: each sample is assigned the class that the largest number of base classifiers predicted for it.
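
To see the vote on a small scale, here is a minimal sketch of the same loop in plain Python, run on made-up predictions. The function name `majority_vote` and the `toy_predictions` values are hypothetical and not part of the original code. The tie-break (smallest class index wins) is chosen to match what `np.argmax` does on the `counts` list.

```python
def majority_vote(base_predictions, n_classes=10):
    """Combine per-learner predictions into one label per sample by majority vote.

    base_predictions: list of equal-length sequences, one per base learner.
    Ties go to the smallest class index, like np.argmax on the counts.
    """
    n_samples = len(base_predictions[0])
    ensemble_predictions = []
    for i in range(n_samples):
        # counts[c] = number of base learners that predicted class c for sample i
        counts = [0] * n_classes
        for learner_predictions in base_predictions:
            counts[learner_predictions[i]] += 1
        # max() returns the first index with the highest count
        final = max(range(n_classes), key=counts.__getitem__)
        ensemble_predictions.append(final)
    return ensemble_predictions


# Hypothetical predictions from three base learners on four test samples
toy_predictions = [
    [3, 7, 1, 0],
    [3, 2, 1, 9],
    [5, 7, 1, 4],
]
votes = majority_vote(toy_predictions)
print(votes)  # → [3, 7, 1, 0]
```

In the last sample all three learners disagree, so every class has at most one vote and the smallest index (0) is returned. With only 10 trees such ties can occur, which is one reason an odd or larger ensemble size is sometimes preferred.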

    This answer was selected by the asker as the best answer.

Question events

  • Closed by the system on August 4
  • Answer accepted on July 27
