dongfangshenyang 2022-08-17 10:03 采纳率: 100%
浏览 261
已结题

为什么loss和acc陡然下降 如何调整为宜?(深度学习 影像分割 分割 二值分类 TensorFlow keras unet )

问题遇到的现象和发生背景

深度学习 影像图斑分割 二分类
使用框架TensorFlow keras unet
显卡P5000
初始遥感影像样本旋转了3次 裁切得到训练及验证样本集 验证样本随机获取

问题相关代码,请勿粘贴截图

batch_size =8
input_size=256,256,3
epochs=260
learning_rate=5e-5
train_num=10456
validation_num=1306

用于配置训练模型(优化器、目标函数、模型评估标准)

model.compile(optimizer = Adam(lr = learning_rate), loss = 'categorical_crossentropy', metrics = ['accuracy'])

运行结果及报错内容

2022-08-16 20:43:33.834002: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2022-08-16 20:43:34.821246: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.
This message will be only logged once.
2022-08-16 20:43:34.871322: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
1307/1307 [ ] - 1045s 800ms/step - loss: 0.6964 - accuracy: 0.8024 - val_loss: 0.6732 - val_accuracy: 0.8048

Epoch 00001: loss improved from inf to 0.69645, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 2/260
1307/1307 [ ] - 1047s 801ms/step - loss: 0.6232 - accuracy: 0.9028 - val_loss: 0.6156 - val_accuracy: 0.8981

Epoch 00002: loss improved from 0.69645 to 0.62322, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 3/260
1307/1307 [ ] - 1024s 784ms/step - loss: 0.5814 - accuracy: 0.9059 - val_loss: 0.5668 - val_accuracy: 0.9157

Epoch 00003: loss improved from 0.62322 to 0.58136, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 4/260
1307/1307 [ ] - 960s 734ms/step - loss: 0.5437 - accuracy: 0.9053 - val_loss: 0.5916 - val_accuracy: 0.8800

Epoch 00004: loss improved from 0.58136 to 0.54374, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 5/260
1307/1307 [ ] - 960s 734ms/step - loss: 0.5069 - accuracy: 0.9104 - val_loss: 0.5355 - val_accuracy: 0.9004

Epoch 00005: loss improved from 0.54374 to 0.50688, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 6/260
1307/1307 [ ] - 959s 734ms/step - loss: 0.4756 - accuracy: 0.9103 - val_loss: 0.4023 - val_accuracy: 0.9107

Epoch 00006: loss improved from 0.50688 to 0.47559, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 7/260
1307/1307 [ ] - 960s 735ms/step - loss: 0.4449 - accuracy: 0.9140 - val_loss: 0.4558 - val_accuracy: 0.8947

Epoch 00007: loss improved from 0.47559 to 0.44491, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 8/260
1307/1307 [ ] - 960s 734ms/step - loss: 0.4195 - accuracy: 0.9141 - val_loss: 0.3656 - val_accuracy: 0.8967

Epoch 00008: loss improved from 0.44491 to 0.41947, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 9/260
1307/1307 [ ] - 960s 734ms/step - loss: 0.3982 - accuracy: 0.9130 - val_loss: 0.3337 - val_accuracy: 0.9070

Epoch 00009: loss improved from 0.41947 to 0.39823, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 10/260
1307/1307 [ ] - 960s 734ms/step - loss: 0.3777 - accuracy: 0.9132 - val_loss: 0.3189 - val_accuracy: 0.8995

Epoch 00010: loss improved from 0.39823 to 0.37770, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 11/260
1307/1307 [ ] - 960s 735ms/step - loss: 0.3639 - accuracy: 0.9138 - val_loss: 0.2769 - val_accuracy: 0.9046

Epoch 00011: loss improved from 0.37770 to 0.36389, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 12/260
1307/1307 [ ] - 960s 735ms/step - loss: 0.3607 - accuracy: 0.9160 - val_loss: 0.3507 - val_accuracy: 0.9007

Epoch 00012: loss improved from 0.36389 to 0.36071, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 13/260
1307/1307 [ ] - 960s 734ms/step - loss: 0.3594 - accuracy: 0.9172 - val_loss: 0.2733 - val_accuracy: 0.8945

Epoch 00013: loss improved from 0.36071 to 0.35935, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 14/260
1307/1307 [ ] - 960s 734ms/step - loss: 0.3529 - accuracy: 0.9225 - val_loss: 0.2974 - val_accuracy: 0.8967

Epoch 00014: loss improved from 0.35935 to 0.35292, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 15/260
1307/1307 [ ] - 960s 735ms/step - loss: 0.3586 - accuracy: 0.9162 - val_loss: 0.2786 - val_accuracy: 0.9005

Epoch 00015: loss did not improve from 0.35292
Epoch 16/260
1307/1307 [ ] - 959s 734ms/step - loss: 0.3506 - accuracy: 0.9223 - val_loss: 0.3221 - val_accuracy: 0.9024

Epoch 00016: loss improved from 0.35292 to 0.35065, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 17/260
1307/1307 [ ] - 959s 734ms/step - loss: 0.3480 - accuracy: 0.9220 - val_loss: 0.4305 - val_accuracy: 0.8949

Epoch 00017: loss improved from 0.35065 to 0.34804, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 18/260
1307/1307 [ ] - 956s 731ms/step - loss: 0.3792 - accuracy: 0.8988 - val_loss: 0.8560 - val_accuracy: 0.1180

Epoch 00018: loss did not improve from 0.34804
Epoch 19/260
1307/1307 [ ] - 958s 733ms/step - loss: 0.3761 - accuracy: 0.8980 - val_loss: 0.7498 - val_accuracy: 0.1046

Epoch 00019: loss did not improve from 0.34804
Epoch 20/260
1307/1307 [ ] - 957s 732ms/step - loss: 0.3722 - accuracy: 0.8960 - val_loss: 0.7169 - val_accuracy: 0.1014

Epoch 00020: loss did not improve from 0.34804
Epoch 21/260
1307/1307 [ ] - 958s 733ms/step - loss: 0.3674 - accuracy: 0.8944 - val_loss: 0.6812 - val_accuracy: 0.8971

Epoch 00021: loss did not improve from 0.34804
Epoch 22/260
1307/1307 [ ] - 958s 733ms/step - loss: 0.3503 - accuracy: 0.9013 - val_loss: 0.6408 - val_accuracy: 0.8844

Epoch 00022: loss did not improve from 0.34804
Epoch 23/260
1307/1307 [ ] - 958s 733ms/step - loss: 0.0431 - accuracy: 0.1980 - val_loss: 1.1921e-07 - val_accuracy: 0.0978

Epoch 00023: loss improved from 0.34804 to 0.04306, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 24/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.1007 - val_loss: 1.1921e-07 - val_accuracy: 0.1002

Epoch 00024: loss improved from 0.04306 to 0.00000, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 25/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.1008 - val_loss: 1.1921e-07 - val_accuracy: 0.0957

Epoch 00025: loss improved from 0.00000 to 0.00000, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 26/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.0996 - val_loss: 1.1921e-07 - val_accuracy: 0.1071

Epoch 00026: loss did not improve from 0.00000
Epoch 27/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.1041 - val_loss: 1.1921e-07 - val_accuracy: 0.1075

Epoch 00027: loss did not improve from 0.00000
Epoch 28/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.0996 - val_loss: 1.1921e-07 - val_accuracy: 0.0995

Epoch 00028: loss did not improve from 0.00000
Epoch 29/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.0992 - val_loss: 1.1921e-07 - val_accuracy: 0.0937

Epoch 00029: loss did not improve from 0.00000
Epoch 30/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.0985 - val_loss: 1.1921e-07 - val_accuracy: 0.1015

Epoch 00030: loss did not improve from 0.00000
Epoch 31/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.1025 - val_loss: 1.1921e-07 - val_accuracy: 0.0973

Epoch 00031: loss did not improve from 0.00000
Epoch 32/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.0972 - val_loss: 1.1921e-07 - val_accuracy: 0.1082

Epoch 00032: loss did not improve from 0.00000
Epoch 33/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.0989 - val_loss: 1.1921e-07 - val_accuracy: 0.0937

Epoch 00033: loss did not improve from 0.00000
Epoch 34/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.0995 - val_loss: 1.1921e-07 - val_accuracy: 0.0986

Epoch 00034: loss did not improve from 0.00000
Epoch 35/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.0995 - val_loss: 1.1921e-07 - val_accuracy: 0.1008

Epoch 00035: loss improved from 0.00000 to 0.00000, saving model to F:\Data\QQCT2025\RSICPRJ\model\model.hdf5
Epoch 36/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.1017 - val_loss: 1.1921e-07 - val_accuracy: 0.0910

Epoch 00036: loss did not improve from 0.00000
Epoch 37/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.1003 - val_loss: 1.1921e-07 - val_accuracy: 0.0993

Epoch 00037: loss did not improve from 0.00000
Epoch 38/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.1041 - val_loss: 1.1921e-07 - val_accuracy: 0.0965

Epoch 00038: loss did not improve from 0.00000
Epoch 39/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.1024 - val_loss: 1.1921e-07 - val_accuracy: 0.0895

Epoch 00039: loss did not improve from 0.00000
Epoch 40/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.0992 - val_loss: 1.1921e-07 - val_accuracy: 0.0962

Epoch 00040: loss did not improve from 0.00000
Epoch 41/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.1041 - val_loss: 1.1921e-07 - val_accuracy: 0.0978

Epoch 00041: loss did not improve from 0.00000
Epoch 42/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.1008 - val_loss: 1.1921e-07 - val_accuracy: 0.0985

Epoch 00042: loss did not improve from 0.00000
Epoch 43/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.1034 - val_loss: 1.1921e-07 - val_accuracy: 0.1030

Epoch 00043: loss did not improve from 0.00000
Epoch 44/260
1307/1307 [ ] - 958s 733ms/step - loss: 1.1921e-07 - accuracy: 0.1032 - val_loss: 1.1921e-07 - val_accuracy: 0.1070

Epoch 00044: loss did not improve from 0.00000
Epoch 45/260
653/1307 [=>] - ETA: 1:17:37 - loss: 1.1921e-07 - accuracy: 0.1055

我的解答思路和尝试过的方法

深度学习半路出家 初次开展这么多样本的训练
昨天 开始训练时就出现过acc较低的情况 开始几轮0.1的acc 重启了几次 把lr从2e-5调整为5e-5
训练acc前几轮达到0.8左右 之后继续训练
今天发现 在18轮出现loss和acc陡然下降 之后 acc维持在0.1左右 才训练至45轮

我想要达到的结果

请问loss和acc陡然下降是什么原因?
怎样调整或检查排除才能让loss 和acc正常 曲线拟合至最佳状态?
就遥感影像地类要素分类来说 基础样本旋转3次得到10000左右训练样本是否可用 或还有哪些增强方式可用?

  • 写回答

9条回答 默认 最新

  • herosunly Python领域优质创作者 2022-08-17 12:56
    关注
    1. 数据增强使用CutMix和Cutout,其中CutMix就是将一部分区域cut掉但不填充0像素而是随机填充训练集中的其他数据的区域像素值,分类结果按一定的比例分配;Cutout:随机的将样本中的部分区域cut掉,并且填充0像素值,分类的结果不变
    2. 激活函数可尝试使用swish或者mish
    3. 训练trick可使用学习率预热+学习率余弦衰减,建议先试预热,然后试两者合起来。
      学习率预热的代码为:
    # 在10个epoch学习率从5e-8逐渐提升为5e-7
    from tensorflow.keras.callbacks import LearningRateScheduler
    
    lr_schedule = [10]
    
    def schedule(epoch_idx):
        if epoch_idx  < lr_schedule[0]:
            return 5e-7 / 10 * (epoch_idx+1)
       
        return 5e-7
    
    scheduler = LearningRateScheduler(schedule=schedule)
    
    model.fit(X_train, X_train_label,
                      validation_data=(X_val, X_val_label),
                      epochs=100, batch_size=64,
                      shuffle=True,
                      callbacks=[scheduler] #回调代码
    

    两者合起来的代码为:

    from tensorflow.keras.callbacks import LearningRateScheduler
    
    lr_schedule = [10]
    
    def schedule(epoch_idx):
        if epoch_idx < lr_schedule[0]:
            return 5e-7 / 10 * (epoch_idx+1)
        else:
            t = (epoch_idx - 10) * math.pi / 100 
            return  1/2 * (1 + math.cos(t)) * 0.1
    
    scheduler = LearningRateScheduler(schedule=schedule)
    
    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论 编辑记录
查看更多回答(8条)

报告相同问题?

问题事件

  • 系统已结题 8月28日
  • 已采纳回答 8月20日
  • 赞助了问题酬金40元 8月17日
  • 赞助了问题酬金60元 8月17日
  • 展开全部

悬赏问题

  • ¥200 csgo2的viewmatrix值是否还有别的获取方式
  • ¥15 Stable Diffusion,用Ebsynth utility在视频选帧图重绘,第一步报错,蒙版和帧图没法生成,怎么处理啊
  • ¥15 请把下列每一行代码完整地读懂并注释出来
  • ¥15 pycharm运行main文件,显示没有conda环境
  • ¥15 寻找公式识别开发,自动识别整页文档、图像公式的软件
  • ¥15 为什么eclipse不能再下载了?
  • ¥15 编辑cmake lists 明明写了project项目名,但是还是报错怎么回事
  • ¥15 关于#计算机视觉#的问题:求一份高质量桥梁多病害数据集
  • ¥15 特定网页无法访问,已排除网页问题
  • ¥50 如何将脑的图像投影到颅骨上