孟婆,来碗汤啊
2017-02-08 12:35

Error when setting the file view while writing a file with MPI-2 parallel I/O

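Several processes operate on one file. Process 0 takes no part in the file operations, but because the calls are collective it still has to call the set-view and write functions.
Here is the program: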
#include <mpi.h>
#include <QString>
#include <QDebug>

int main(int argc, char *argv[])
{
    int myid = 0;      // rank of the current process
    int numprocs = 0;  // total number of processes
    int namelen = 0;
    char processor_name[MPI_MAX_PROCESSOR_NAME];

    // Initialize MPI
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Get_processor_name(processor_name, &namelen);

    int ret = 0;
    QString GatherName = "./data_file";

    // Rank 0 has nothing to write; every other rank writes 1000 records
    int nshot = 0;
    if (myid == 0)
        nshot = 0;
    else
        nshot = 1000;

    int dataType = 2100 * 4;  // bytes per record (2100 floats)
    int ntrmax = 27;
    MPI_Offset DataSize = nshot * ntrmax * dataType;

    MPI_File fh;
    ret = MPI_File_open(MPI_COMM_WORLD, GatherName.toLatin1().data(),
                        MPI_MODE_WRONLY | MPI_MODE_CREATE, MPI_INFO_NULL, &fh);
    if (0 != ret)
        return -1;
    MPI_File_set_size(fh, DataSize);

    float *DataBuf = new float[2100 * nshot];
    for (int iData = 0; iData < 2100 * nshot; iData++)
        DataBuf[iData] = myid + 1;

    // Record index in the file for each block written by this rank
    int *arr = new int[nshot];
    for (int itr = 0; itr < nshot; itr++)
    {
        arr[itr] = itr * ntrmax + myid;
    }

    // Block lengths and byte displacements for the hindexed file type
    int *Array_of_blockLengthData = new int[nshot];
    MPI_Aint *Array_of_displament_Data = new MPI_Aint[nshot];
    for (int itr = 0; itr < nshot; itr++)
    {
        Array_of_blockLengthData[itr] = dataType;
        Array_of_displament_Data[itr] = (MPI_Aint)(arr[itr] * dataType);
    }

    MPI_Status stData;
    MPI_Datatype DatafType;
    ret = MPI_Type_create_hindexed(nshot, Array_of_blockLengthData,
                                   Array_of_displament_Data, MPI_CHAR, &DatafType);
    if (0 == ret)
    {
        ret = MPI_Type_commit(&DatafType);
        if (0 == ret)
        {
            int size = 0;
            MPI_Type_size(DatafType, &size);
            qDebug() << "Write Head...myid is:" << myid << "\t Head file type size is:" << size;
            ret = MPI_File_set_view(fh, 0, MPI_CHAR, DatafType, "native", MPI_INFO_NULL);
            if (0 == ret)
            {
                qDebug() << "myid is:" << myid << "set view success.";
                MPI_File_write_all(fh, DataBuf, nshot * dataType, MPI_CHAR, &stData);
            }
            else
                qDebug() << "set view error. ret is:" << ret;
        }
    }

    MPI_File_close(&fh);
    delete [] arr; arr = NULL;
    delete [] DataBuf; DataBuf = NULL;
    delete [] Array_of_blockLengthData; Array_of_blockLengthData = NULL;
    delete [] Array_of_displament_Data; Array_of_displament_Data = NULL;
    qDebug() << "SUCCESS";
    MPI_Finalize();
    return 0;
}

The program fails when it reaches MPI_File_set_view, with this error:
mpirun noticed that process rank 0 with PID 7008 on node node0 exited on signal 11 (Segmentation fault).

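When all processes take part in the file operations, the program runs to completion for small amounts of data, but fails when the data volume is too large: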
Error in ADIOI_Calc_aggregator(): rank_index(1) >= fd->hints->cb_nodes (1) fd_size=240822704 off=240861600
Error in ADIOI_Calc_aggregator(): rank_index(1) >= fd->hints->cb_nodes (1) fd_size=240822704 off=240870000
[node0:07430] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[node0:07430] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages


2 answers

  • 孟婆,来碗汤啊 2017-02-10 03:33
    Accepted answer

    I found the cause. The failure with large data happens because the extent of the view set by MPI_File_set_view cannot exceed 2 GB; beyond 2 GB the call fails.
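To see where the view extent of the question's layout sits relative to that limit, the arithmetic can be sketched in plain C++ without MPI. The helper names `viewExtentBytes` and `fitsIn32BitInt` are hypothetical, and the 2 GB threshold is taken from the answer above rather than from the MPI standard; whether it applies probably depends on the MPI implementation and version. The layout assumed is the one in the question: rank `myid` writes record `itr * ntrmax + myid`, and each record is `dataType` bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Byte offset just past the last block a rank touches, assuming the
// question's layout: block itr starts at (itr * ntrmax + rank) * recordBytes.
// All products are done in 64-bit so the result itself cannot overflow int.
std::int64_t viewExtentBytes(int rank, int nshot, int ntrmax, int recordBytes) {
    assert(rank >= 0 && nshot > 0 && ntrmax > 0 && recordBytes > 0);
    const std::int64_t lastRecord =
        static_cast<std::int64_t>(nshot - 1) * ntrmax + rank;
    return (lastRecord + 1) * recordBytes;
}

// True if the extent still fits in a signed 32-bit integer (just under 2 GiB).
bool fitsIn32BitInt(std::int64_t bytes) {
    return bytes <= std::numeric_limits<std::int32_t>::max();
}
```

With the question's values (nshot = 1000, ntrmax = 27, 8400-byte records) the extent for rank 26 is 226,800,000 bytes, well under the limit. At nshot = 10000 the extent for rank 0 is already 2,267,781,600 bytes, which no longer fits in a 32-bit signed integer.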
    As for the error reported when the program reaches MPI_File_set_view:
    mpirun noticed that process rank 0 with PID 7008 on node node0 exited on signal 11 (Segmentation fault).
    Cause: the size of the file type passed to the view must not be 0.
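One possible way to keep a rank with no data in the collective calls is sketched here. It assumes the MPI implementation follows the standard's rule that `filetype` may differ between ranks in `MPI_File_set_view` as long as the etype matches. Under that assumption the idle rank can pass the etype (`MPI_CHAR` in the question) as its file type and write zero elements, so its file type never has size 0. The `ViewPlan` struct and `planView` function are hypothetical names that only capture this decision; the MPI calls are left as comments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of what one rank hands to the collective calls.
struct ViewPlan {
    bool useIndexedType;      // true: build and commit the hindexed file type
                              // false: pass the etype itself as the file type
    std::int64_t writeBytes;  // element count for MPI_File_write_all
};

// Decide per rank whether the hindexed type would be empty.
ViewPlan planView(int nshot, int recordBytes) {
    assert(nshot >= 0 && recordBytes > 0);
    if (nshot == 0) {
        // Idle rank: MPI_File_set_view(fh, 0, MPI_CHAR, MPI_CHAR, ...)
        // followed by MPI_File_write_all(fh, buf, 0, MPI_CHAR, &status).
        return ViewPlan{false, 0};
    }
    // Active rank: MPI_Type_create_hindexed(...) with nshot blocks,
    // then MPI_File_set_view with that type and a write of writeBytes chars.
    return ViewPlan{true, static_cast<std::int64_t>(nshot) * recordBytes};
}
```

Every rank still makes the same sequence of collective calls; only the file type and the count differ.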

    Another cause is a bug in the MPI version itself.
