wxypku 2018-07-05 02:44 Acceptance rate: 20%

Problem running MPI across multiple nodes

I installed Open MPI. When I run a program on two nodes with mpirun, I get the error below. What could be causing it?
shell$: /usr/local/openmpi/bin/mpiexec -np 2 --hostfile nodeinfo ./test
Error output:
Primary job terminated normally, but 1 process returned

a non-zero exit code.. Per user-direction, the job has been aborted.

./test: error while loading shared libraries: libcudart.so.9.0: cannot open shared object file: No such file or directory

./test: error while loading shared libraries: libcudart.so.9.0: cannot open shared object file: No such file or directory

mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[65150,1],0]

Exit code: 127
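
Note: the output above says that the dynamic loader could not find libcudart.so.9.0 (the CUDA 9.0 runtime library) when starting ./test. A plausible cause, though not one confirmed in this thread, is that CUDA 9.0 is missing on one of the nodes, or that its library directory is not on the loader's search path in the non-interactive shell that mpiexec starts. A minimal sketch for checking this follows. The helper name missing_libs and the path /usr/local/cuda-9.0/lib64 are assumptions for illustration; substitute the actual CUDA location on your machines.

```shell
#!/bin/sh
# Hypothetical helper: list the shared libraries that the dynamic loader
# cannot resolve for a given binary. Prints nothing if all are found.
missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found' || true
}

# Run the same check on every host listed in the hostfile, for example:
#   ssh <host> "ldd /path/to/test | grep 'not found'"
#
# If libcudart.so.9.0 is reported as missing, one option is to add the CUDA
# library directory (assumed here to be /usr/local/cuda-9.0/lib64) to
# LD_LIBRARY_PATH and forward the variable to the launched processes with
# Open MPI's -x option:
#   export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH
#   /usr/local/openmpi/bin/mpiexec -x LD_LIBRARY_PATH -np 2 --hostfile nodeinfo ./test
```

Calling missing_libs ./test on each node shows whether the library resolves there. If it does not resolve even with LD_LIBRARY_PATH set, the CUDA runtime is probably not installed on that node at the assumed path.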


2 answers

  • 桃汽宝 2021-01-02 14:59
    Did you manage to solve this? I have run into the same problem.