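The system is Windows 11 and the GPU is a 5060 Ti.
tensorflow-gpu is version 2.10, and two CUDA versions are installed:
one is CUDA 11.2 with cuDNN 8.1
the other is CUDA 13.0 with cuDNN 9.16
The driver is the latest version, and the environment variables should be set correctly.
CUDA 13.0 is installed so that I can use a recent version of PyTorch.
The environment variables are as follows: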
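For anyone who wants the same information as text, here is a minimal sketch that prints the CUDA-related environment variables and PATH entries. It assumes the usual names the CUDA Windows installer sets (CUDA_PATH plus one CUDA_PATH_Vx_y per installed toolkit); the function name cuda_env_report is made up for illustration.

```python
import os


def cuda_env_report(environ=None):
    """Collect CUDA-related environment variables and PATH entries as text lines."""
    env = os.environ if environ is None else environ
    lines = []
    # Variables such as CUDA_PATH and CUDA_PATH_V11_2 (assumed installer defaults).
    for name in sorted(env):
        if name.upper().startswith("CUDA"):
            lines.append(f"{name}={env[name]}")
    # PATH order matters: the first matching directory decides which DLLs get loaded.
    for i, entry in enumerate(env.get("PATH", "").split(os.pathsep)):
        lowered = entry.lower()
        if "cuda" in lowered or "cudnn" in lowered:
            lines.append(f"PATH[{i}]={entry}")
    return lines


if __name__ == "__main__":
    print("\n".join(cuda_env_report()))
```

With two toolkits installed side by side, the PATH index shows which CUDA bin directory is searched first.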


Before I installed CUDA 11.2, TensorFlow reported that CUDA was not installed.
After installing CUDA 11.2, training a model with TensorFlow prints the following:
2025-12-07 23:40:26.510220: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2027] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
2025-12-07 23:40:26.510885: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-12-07 23:40:26.512891: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2027] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
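To compare what this TensorFlow build was compiled against with what the GPU reports, something like the sketch below could be run. It is only a diagnostic under assumptions: describe_tf_gpu is a hypothetical helper name, and the keys read from the build info (cuda_version, cudnn_version) may be absent on some builds, in which case None is returned for them.

```python
def describe_tf_gpu():
    """Return TensorFlow's build-time CUDA/cuDNN versions and the GPUs it can see."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this interpreter.
        return {"tensorflow": None}

    build = tf.sysconfig.get_build_info()
    info = {
        "tensorflow": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "cuda_version": build.get("cuda_version"),
        "cudnn_version": build.get("cudnn_version"),
        "gpus": [],
    }
    for gpu in tf.config.list_physical_devices("GPU"):
        details = tf.config.experimental.get_device_details(gpu)
        info["gpus"].append({
            "name": details.get("device_name"),
            # A (major, minor) tuple when the driver reports it.
            "compute_capability": details.get("compute_capability"),
        })
    return info


if __name__ == "__main__":
    for key, value in describe_tf_gpu().items():
        print(f"{key}: {value}")
```

A compute capability that the build has no precompiled kernels for is what triggers the PTX JIT warning in the log above.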
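It then spends a long time compiling before training can start.
The problem is that after the long wait the compilation fails, and training then only barely runs on CUDA. As a result training is very slow: the first epoch takes 900 s, and every epoch after that still takes more than 60 s, which is slower than using the CPU. Normally each epoch finishes in about 10 s.
How can I solve this?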