A for loop inside a CUDA kernel function triggers "unspecified launch failure"

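I wanted to test whether the for loop inside a CUDA kernel function can run an arbitrarily large number of iterations, so I wrote the simple program below.
When size is as large as the ten million used below, the program fails: it reports that the kernel call failed with "unspecified launch failure".
I checked with the tools bundled with Nsight. The cause is not register overuse, a failed memory allocation, or an overflow; the only thing that changes as size grows is that the instructions per warp (IPW) increase. Is the error related to that? The code below looks correct to me, and the equivalent loop in C++ can run any number of times. Does CUDA impose some limit on kernels? I would like to know the cause of the error.
System: Windows 10 Pro 64-bit
IDE: Visual Studio 2015 Community
CUDA 8.0
GPU: GTX 860M 4GB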

#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include "stdlib.h"
#include <stdio.h>
#include <iostream>
using namespace std;

// A single thread writes all `size` elements one after another.
__global__ void Kernel(double *M_gpu, int size)
{
    for (int i = 0; i < size; i++)
    {
        M_gpu[i] = i / 2 + 6;
    }
}

int main()
{
    cudaError_t cudaStatus;
    // Allocate host (CPU) memory
    int size = 10000000;
    double *M = (double *)malloc(sizeof(double) * size);
    // Allocate device (GPU) memory
    double *M_gpu;
    cudaStatus = cudaMalloc((void **)&M_gpu, sizeof(double) * size);
    if (cudaStatus != cudaSuccess)
    {
        cout << "Failed to allocate GPU memory! " << cudaGetErrorString(cudaStatus) << endl; getchar(); exit(0);
    }
    // Launch the kernel. The launch configuration was lost from the post;
    // <<<1, 1>>> is an assumption, since the kernel uses no thread indices.
    Kernel<<<1, 1>>>(M_gpu, size);
    cudaDeviceSynchronize();
    cudaStatus = cudaGetLastError();
    if (cudaStatus != cudaSuccess)
    {
        cout << "Kernel call failed! " << cudaGetErrorString(cudaStatus) << endl; getchar(); exit(0);
    }
    // Copy the data from the GPU back to the CPU
    cudaStatus = cudaMemcpy(M, M_gpu, sizeof(double) * size, cudaMemcpyDeviceToHost);
    if (cudaStatus != cudaSuccess)
    {
        cout << "Failed to copy data! " << cudaGetErrorString(cudaStatus) << endl; getchar(); exit(0);
    }
    // END
    cout << "Success!" << endl;
    getchar();
}

yangbo50304 2017-02-16 17:27
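I tested the code, and the failure has nothing to do with the Release build configuration. From what I found online, the kernel appears to run past a timeout, so it gets terminated outright.
In Nsight, open Options and raise the General -> WDDM TDR Delay setting, then try again. The default should be 2 s.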
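To check whether a run-time limit applies to the GPU at all, one option is to query the device attribute cudaDevAttrKernelExecTimeout. The runtime reports it as nonzero when kernels on that device are subject to a time limit, which is typically the case when the GPU also drives a display. Below is a minimal sketch, assuming device 0 is the GPU in question; watchdogEnabled is a hypothetical helper name, not something from the post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns 1 if kernels on `device` are subject to a
// run-time limit, 0 if they are not, and -1 if the query itself failed.
int watchdogEnabled(int device)
{
    int limited = 0;
    cudaError_t status =
        cudaDeviceGetAttribute(&limited, cudaDevAttrKernelExecTimeout, device);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaDeviceGetAttribute failed: %s\n",
                cudaGetErrorString(status));
        return -1;
    }
    return limited ? 1 : 0;
}

int main()
{
    int state = watchdogEnabled(0);  // assumption: device 0 is the GPU in question
    if (state == 1) {
        printf("Kernels on device 0 have a run-time limit.\n");
    } else if (state == 0) {
        printf("Kernels on device 0 have no run-time limit.\n");
    }
    return state < 0 ? 1 : 0;
}
```

If the limit is reported as enabled, a kernel that runs longer than the TDR delay is a plausible cause of the failure described in the question.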

This answer was selected by the asker as the best answer.
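Besides raising the TDR delay, the kernel can usually be kept short by spreading the loop over many threads instead of running all iterations in one. The following is only a sketch of that idea, not the asker's code: FillKernel and blocksFor are hypothetical names, and 256 threads per block is an arbitrary choice. It uses a grid-stride loop and checks launch errors and execution errors separately.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Grid-stride loop: each thread handles elements i, i + stride, i + 2 * stride, ...
__global__ void FillKernel(double *M_gpu, int size)
{
    int stride = blockDim.x * gridDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < size; i += stride) {
        M_gpu[i] = i / 2 + 6;  // same integer arithmetic as the kernel in the question
    }
}

// Hypothetical helper: number of blocks needed to cover n elements.
int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

int main()
{
    const int size = 10000000;
    const int threads = 256;  // arbitrary choice for this sketch
    double *M = (double *)malloc(sizeof(double) * size);
    double *M_gpu = NULL;
    if (M == NULL ||
        cudaMalloc((void **)&M_gpu, sizeof(double) * size) != cudaSuccess) {
        fprintf(stderr, "Allocation failed\n");
        return 1;
    }

    FillKernel<<<blocksFor(size, threads), threads>>>(M_gpu, size);
    cudaError_t status = cudaGetLastError();  // errors from the launch itself
    if (status == cudaSuccess) {
        status = cudaDeviceSynchronize();     // errors raised while the kernel ran
    }
    if (status == cudaSuccess) {
        status = cudaMemcpy(M, M_gpu, sizeof(double) * size,
                            cudaMemcpyDeviceToHost);
    }
    if (status != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(status));
        return 1;
    }

    printf("M[0] = %f, M[size - 1] = %f\n", M[0], M[size - 1]);
    cudaFree(M_gpu);
    free(M);
    return 0;
}
```

With the work divided this way, each thread performs only a handful of iterations, so the kernel should finish far sooner than a single-threaded loop over all ten million elements.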

