2301_81337622 2024-04-24 15:18 Acceptance rate: 0%

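How to plot a calibration curve for a DeepHit model

How do I plot a calibration curve for a DeepHit model, and how is it implemented in code? My current code is below: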

import torchtuples as tt
from pycox.models import DeepHitSingle

num_durations = 130        # maximum survival time / 12
labtrans = DeepHitSingle.label_transform(num_durations)
get_target = lambda df: (df['survival'].values, df['OS'].values)
y_train = labtrans.fit_transform(*get_target(df_train))
y_val = labtrans.transform(*get_target(df_val))
train = (x_train, y_train)
val = (x_val, y_val)
# Get the true survival times and event indicators for the validation set
durations_val, events_val = get_target(df_val)

durations_test, events_test = get_target(df_test)
in_features = x_train.shape[1]
num_nodes = [32, 64, 32]   # adjustable: add layers or change the node counts
out_features = labtrans.out_features
batch_norm = True
dropout = 0.5  # tune to control overfitting

net = tt.practical.MLPVanilla(in_features, num_nodes, out_features, batch_norm, dropout)
model = DeepHitSingle(net, tt.optim.AdamW, alpha=0.0001, sigma=0.1, duration_index=labtrans.cuts)     
batch_size = 256
lr_finder = model.lr_finder(x_train, y_train, batch_size, tolerance=3)
_ = lr_finder.plot()
lr_finder.get_best_lr()
model.optimizer.set_lr(lr_finder.get_best_lr())
epochs = 400
callbacks = [tt.callbacks.EarlyStopping()]
log = model.fit(x_train, y_train, batch_size, epochs, callbacks, val_data=val)
_ = log.plot()
surv = model.predict_surv_df(x_test)


6 answers

  • 专家-赤兔 [online] Top creator: programming frameworks and technology 2024-04-24 15:18

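    Quoted from the free WeChat mini program: 皆我百晓生

    Building on the code you posted, we can add functions that compute the Brier score of the DeepHit model and plot it over time as a check of calibration. First import the required libraries and define a function that computes the Brier score, then plot the curve from those scores. The complete example follows: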

    import numpy as np
    import matplotlib.pyplot as plt
    from pycox.evaluation import EvalSurv
    
    def calculate_brier_scores(model, x, durations, events, n_intervals=10):
        """
        Compute the time-dependent Brier score on an evenly spaced time grid.
        :param model: trained DeepHit model
        :param x: covariates of the samples that durations and events belong to
        :param durations: array of survival times
        :param events: array of event indicators (1 = event, 0 = censored)
        :param n_intervals: number of time points to evaluate
        :return: (time_grid, brier_scores) as NumPy arrays
        """
        # Predicted survival curves: rows are time points, columns are samples
        surv = model.predict_surv_df(x)
    
        # EvalSurv weights samples by a Kaplan-Meier estimate of the censoring
        # distribution, so censored samples are accounted for rather than dropped
        ev = EvalSurv(surv, durations, events, censor_surv='km')
    
        # Evenly spaced evaluation times over the observed follow-up period
        time_grid = np.linspace(durations.min(), durations.max(), n_intervals)
        brier_scores = ev.brier_score(time_grid)
        return time_grid, brier_scores.values
    
    def plot_calibration_curve(brier_scores, time_grid):
        """
        Plot the Brier score over time.
        :param brier_scores: Brier score at each evaluation time
        :param time_grid: evaluation times, same length as brier_scores
        """
        plt.figure(figsize=(8, 6))
        plt.plot(time_grid, brier_scores, label='Brier Score', marker='o')
        plt.xlabel('Time (months)')
        plt.ylabel('Brier Score')
        plt.title('Brier Score over Time for DeepHit Model')
        plt.legend()
        plt.grid(True)
        plt.show()
    
    # Evaluate on the validation set; features and labels must come from the same split
    time_grid, brier_scores = calculate_brier_scores(model, x_val, durations_val, events_val)
    plot_calibration_curve(brier_scores, time_grid)
    

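    This code first defines two functions, calculate_brier_scores and plot_calibration_curve. The former computes the Brier scores and the latter plots them as a curve. The main program then calls both, passing in the trained model together with the validation set's true survival times and event indicators, and optionally the number of time intervals. The result is a curve showing how accurate the model's predicted survival probabilities are over time; lower Brier scores are better.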
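    To make concrete what a Brier score at a single time point measures, here is a minimal sketch on toy data. It assumes no censoring at all (every event time is observed), which real survival data rarely satisfies; the function name brier_score_at and the toy values are illustrative only and not part of pycox.

```python
import numpy as np

def brier_score_at(pred_surv_t, durations, t):
    """Brier score at time t, ASSUMING no censoring (illustrative helper)."""
    pred_surv_t = np.asarray(pred_surv_t, dtype=float)
    # 1.0 if the sample is still event-free at time t, else 0.0
    alive = (np.asarray(durations, dtype=float) > t).astype(float)
    # Mean squared gap between the true status and the predicted survival probability
    return float(np.mean((alive - pred_surv_t) ** 2))

# Toy data: four samples with observed event times and predicted S(t=6)
durations = np.array([2.0, 5.0, 8.0, 12.0])
pred = np.array([0.1, 0.4, 0.7, 0.9])
score = brier_score_at(pred, durations, t=6.0)
print(score)
```

    With censored data the squared errors need to be reweighted (for example by inverse probability of censoring), so this uncensored version is only meant to show the basic quantity.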

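    Note that the data format must match the model's input, and that model training and validation must have completed correctly before you run this code. You can also adjust the n_intervals parameter to change the number of time intervals.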
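    A Brier score curve summarises overall accuracy, whereas a calibration curve in the usual sense compares predicted with observed survival at a fixed time. The following is a minimal sketch of that idea using NumPy only: samples are split into groups by their predicted survival at a horizon t, and each group's mean prediction is compared with its Kaplan-Meier estimate at t. The helper names (km_survival_at, calibration_points, plot_calibration), the horizon and the number of groups are all assumptions made for illustration; none of them come from pycox.

```python
import numpy as np

def km_survival_at(durations, events, t):
    """Kaplan-Meier estimate of S(t) for one group of samples."""
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=int)
    surv = 1.0
    # Walk through the distinct event times up to and including t
    for u in np.unique(durations[(events == 1) & (durations <= t)]):
        at_risk = np.sum(durations >= u)
        died = np.sum((durations == u) & (events == 1))
        surv *= 1.0 - died / at_risk
    return surv

def calibration_points(pred_surv_t, durations, events, t, n_groups=5):
    """Mean predicted vs. Kaplan-Meier observed survival at t, per risk group."""
    pred_surv_t = np.asarray(pred_surv_t, dtype=float)
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=int)
    # Sort samples by predicted survival and split into roughly equal groups
    order = np.argsort(pred_surv_t)
    predicted, observed = [], []
    for idx in np.array_split(order, n_groups):
        predicted.append(pred_surv_t[idx].mean())
        observed.append(km_survival_at(durations[idx], events[idx], t))
    return np.array(predicted), np.array(observed)

def plot_calibration(predicted, observed, t):
    """Plot observed against predicted survival with the ideal diagonal."""
    import matplotlib.pyplot as plt  # imported here so the helpers above need only NumPy
    plt.plot([0, 1], [0, 1], linestyle='--', label='Ideal')
    plt.plot(predicted, observed, marker='o', label='Model')
    plt.xlabel(f'Predicted survival at t={t}')
    plt.ylabel('Observed survival (Kaplan-Meier)')
    plt.legend()
    plt.show()

# Toy data, only to show the call pattern
durations = np.array([1.0, 2.0, 3.0, 4.0])
events = np.array([1, 1, 1, 1])
pred_surv_t = np.array([0.1, 0.2, 0.8, 0.9])
predicted, observed = calibration_points(pred_surv_t, durations, events, t=2.5, n_groups=2)

# Hypothetical usage with the model from the question (not run here):
#   surv = model.predict_surv_df(x_test)
#   row = np.abs(surv.index.values - t).argmin()   # grid time closest to t
#   pred_surv_t = surv.iloc[row].values
#   predicted, observed = calibration_points(pred_surv_t, durations_test, events_test, t)
#   plot_calibration(predicted, observed, t)
```

    Points close to the diagonal indicate good calibration at that horizon. With few samples per group the Kaplan-Meier estimates become noisy, so the number of groups is a trade-off between resolution and stability.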


Question events

  • Question created on April 24