清钟沁桐 2023-04-11 13:17 Acceptance rate: 0%

Command fails after being moved into a shell script

Problem description: on EulerOS release 2.0 (SP8), the following command works correctly when run directly on the command line:

   mpirun --allow-run-as-root -mca pml ucx -mca btl ^vader,tcp,openib,uct -x UCX_TLS=self,sm --bind-to core  -np 128 lmp_omp_daily1120_order1_gnuld_script_offset0_text -in equ.in_omp

To automate this, I want to put the command above into a shell script, for example run-default.sh. Running sh run-default.sh then fails with the following message:

A requested component was not found, or was unable to be opened.  This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded).  Note that
Open MPI stopped checking at the first component that it did not find.

Host:      A191240619
Framework: pml
Component: ucx
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  mca_pml_base_open() failed
  --> Returned "Not found" (-13) instead of "Success" (0)
--------------------------------------------------------------------------
[A191240619:41208] *** An error occurred in MPI_Init
[A191240619:41208] *** reported by process [3357081601,281470681743360]
[A191240619:41208] *** on a NULL communicator

 


1 answer

  • 红色荷包蛋 2023-04-11 14:33

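    The error message says that a requested component could not be found or opened: the ucx component of the pml framework is either not installed or cannot be used on this system, for example because shared libraries it depends on cannot be loaded. MPI_INIT then fails as a result, most likely because of a configuration or environment problem.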

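    This kind of error usually appears when the script runs without some environment variables or settings that are present in the interactive shell. Try setting the required variables, such as LD_LIBRARY_PATH and PATH, inside the script so that the needed libraries and executables can be found.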

    For example, you can add the following two lines to run-default.sh to set LD_LIBRARY_PATH and PATH:

    export LD_LIBRARY_PATH=/path/to/ucx/lib:$LD_LIBRARY_PATH
    export PATH=/path/to/mpirun/bin:$PATH
    
    

    Replace "/path/to/ucx/lib" and "/path/to/mpirun/bin" with the actual paths of the ucx libraries and the mpirun executable.
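One detail about the two export lines: if LD_LIBRARY_PATH is unset when the script starts, the `:$LD_LIBRARY_PATH` suffix leaves a trailing colon, and the dynamic loader treats an empty entry as the current directory. A minimal sketch that avoids this is shown here. It keeps the same placeholder paths, which are not real locations and must still be replaced.

```shell
#!/bin/sh
# Placeholder paths from the answer; replace them with the real install
# locations of the UCX libraries and of mpirun on your system.
# ${VAR:+:$VAR} expands to ":$VAR" only when VAR is set and non-empty,
# so no empty entry (meaning "current directory") is added by accident.
export LD_LIBRARY_PATH="/path/to/ucx/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export PATH="/path/to/mpirun/bin${PATH:+:$PATH}"
```

The mpirun line from the question would follow these two lines unchanged in run-default.sh.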

    If these settings do not solve the problem, you can add more debugging output, for example by adding the following lines to the script:

    echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
    echo "PATH=$PATH"
    mpirun --version
    
    

    This prints the current environment variables and the mpirun version, which helps with further debugging.
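Since the command works interactively and fails in the script, it can also help to compare the two environments directly rather than reading them by eye. The sketch below assumes both runs happen on the same host. The helper names `dump_env` and `env_only_in` and the file names in the usage note are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# dump_env FILE: write the exported environment, sorted, to FILE.
dump_env() {
    env | sort > "$1"
}

# env_only_in FIRST SECOND: print the lines that appear in FIRST but not
# in SECOND. Both files must be sorted, which dump_env takes care of.
env_only_in() {
    comm -23 "$1" "$2"
}
```

A possible way to use it: define the two functions in the interactive shell where the command works and run `dump_env env-interactive.txt`, then add the same definitions plus `dump_env env-script.txt` to run-default.sh just before the mpirun line. Afterwards, `env_only_in env-interactive.txt env-script.txt` lists the variables that exist (or have different values) only in the interactive shell. Only exported variables show up this way; shell aliases and unexported variables do not. If Open MPI's `ompi_info` tool is installed, running `ompi_info | grep ucx` in both places is another quick check of whether the ucx component is visible.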
