The weights that need to be quantified should be on the CUDA device

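I ran into a problem deploying chatglm3 locally:
The weights that need to be quantified should be on the CUDA device
Is there a direct answer? I am a complete beginner in this area, and right now I mainly need to get chatglm3-6b running locally.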


17 answers

专家-赤兔 [online], Quality Creator in programming frameworks, 2024-04-02 20:22
Quoted from the free WeChat mini program 皆我百晓生

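Sorry for the trouble. From your description, you seem to be looking for ways to optimize Python code on CUDA, which likely involves a deep learning library such as PyTorch.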

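Hardware level: First, make sure your GPU has enough memory and that the CUDA environment is set up correctly. Second, try reducing the model size, which lowers the memory and compute load on the device. Finally, you can work on smaller portions of the data at a time rather than the whole dataset, so the work fits on the GPU and still benefits from its parallelism.
Algorithm level: Consider whether part of the computation can be moved to the GPU. For example, in a convolutional neural network (CNN), compute-intensive work such as model training is accelerated well on a GPU. For computer vision and similar applications, you can also consider dedicated acceleration hardware such as NVIDIA Tensor Cores, or a GPU compute framework such as OpenCL.
Parallel programming: If CUDA alone does not meet your needs, you can consider multithreading or multiprocessing to parallelize work across GPUs. Take care with synchronization and communication to avoid problems such as data races.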
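To make the advice about GPU memory and model size concrete, here is a rough back-of-the-envelope sketch. It assumes a model with about 6 billion parameters (a guess based on the "6b" in the model name, not a figure from this thread) and counts only the weights, so real usage will be higher once activations and caches are included. The function names and the 0.8 headroom factor are hypothetical.

```python
# Rough weight-memory estimate at different numeric precisions.
# PARAM_COUNT is an assumption: about 6 billion parameters for a "6b" model.
PARAM_COUNT = 6_000_000_000

BITS_PER_WEIGHT = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_memory_gib(param_count: int, bits: int) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return param_count * bits / 8 / 1024**3


def fits(param_count: int, bits: int, gpu_gib: float, headroom: float = 0.8) -> bool:
    """True if the weights fit within a fraction (headroom) of GPU memory."""
    return weight_memory_gib(param_count, bits) <= gpu_gib * headroom


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name}: {weight_memory_gib(PARAM_COUNT, bits):.1f} GiB")
```

Under these assumptions, half-precision weights need roughly 11 GiB while 4-bit weights need under 3 GiB, which is why a quantized model is often suggested for GPUs with limited memory.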

Here is a simple example that shows how to run a small neural network model on a single GPU:

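import torch
import torch.nn as nn
import torch.nn.functional as F  # provides F.relu used in forward()


# Define the model structure
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=3)
        self.pool = nn.MaxPool2d(kernel_size=2)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=3)
        self.fc1 = nn.Linear(16 * 8 * 8, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 8 * 8)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = Net()
# 'model.pth' is assumed to hold a state dict saved from this same Net
net.load_state_dict(torch.load('model.pth', map_location=device))
net.to(device).eval()

# Dummy 3-channel 40x40 input (an assumed size that matches fc1: 40 -> 38 -> 19 -> 17 -> 8)
x = torch.randn(1, 3, 40, 40, device=device)
with torch.no_grad():
    output = net(x).squeeze()
print(output.shape)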

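This example builds the model with PyTorch's built-in nn.Module class. The forward() function implements the model's forward pass, and torch.load() restores the model's saved state from a file. Note that the model takes a 3-channel image as input, uses 3x3 convolution kernels, and outputs a 10-dimensional vector.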
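The `16 * 8 * 8` in the first fully connected layer only matches certain input sizes, which the answer does not state. As a sketch, the arithmetic below applies the standard output-size formula for unpadded convolution and pooling layers to the two conv-plus-pool stages of the model. The helper names and the 40x40 input size are assumptions chosen for illustration.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output side length of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_side(input_side: int) -> int:
    """Side length after conv1 -> pool -> conv2 -> pool in the model above."""
    side = conv_out(input_side, kernel=3)         # conv1, 3x3 kernel
    side = conv_out(side, kernel=2, stride=2)     # 2x2 max pooling
    side = conv_out(side, kernel=3)               # conv2, 3x3 kernel
    side = conv_out(side, kernel=2, stride=2)     # 2x2 max pooling
    return side


if __name__ == "__main__":
    for side in (32, 40):
        final = feature_map_side(side)
        print(f"{side}x{side} input -> 16 * {final} * {final} features")
```

With a 40x40 input the final feature map is 8x8, which agrees with `16 * 8 * 8`. A 32x32 input would give 6x6 instead, so the flattening step would not produce the 1024 features per image that the first fully connected layer expects.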

If you can provide more information about your device, I can give you better advice.
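One way to gather the device information requested here is a short diagnostic script. This is a minimal sketch, not an official tool: the function name `collect_device_info` and the dictionary keys are hypothetical, and it only imports PyTorch if it is installed, so it also runs on a machine without it.

```python
import importlib.util
import platform
import shutil
import sys


def collect_device_info() -> dict:
    """Collect basic facts about Python, the OS, and any CUDA GPU PyTorch can see."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        # True if the NVIDIA driver's command-line tool is on PATH
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if not info["torch_installed"]:
        return info
    try:
        import torch
    except Exception as exc:  # a broken install can fail at import time
        info["torch_import_error"] = str(exc)
        return info
    info["torch_version"] = torch.__version__
    info["torch_cuda_build"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        props = torch.cuda.get_device_properties(0)
        info["gpu_name"] = props.name
        info["gpu_memory_gib"] = round(props.total_memory / 1024**3, 1)
    return info


if __name__ == "__main__":
    for key, value in collect_device_info().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as False, or `torch_cuda_build` is None, PyTorch cannot place weights on a CUDA device, which is worth ruling out first when an error says weights should be on the CUDA device.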



Question events

Closed, April 9
Question edited, April 7
Bounty of 15 yuan added, April 3
Question created, April 2
