Slurm, slurmctld status failed
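Could someone please help me figure out why slurmctld is in the failed state? It was fine right after installation yesterday, but today it no longer starts.
slurmd and slurmdbd are both running fine.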
slurmctld.service - Slurm controller daemon
Loaded: loaded (/etc/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sat 2023-10-14 06:39:17 UTC; 1h 20min ago
Process: 6988 ExecStart=/opt/slurm/23.02.6/sbin/slurmctld -D -s $SLURMCTLD_OPTIONS (code=exited, status=1/FAILURE)
Main PID: 6988 (code=exited, status=1/FAILURE)
Oct 14 06:39:17 DUT7152ATSM systemd[1]: Started Slurm controller daemon.
Oct 14 06:39:17 DUT7152ATSM slurmctld[6988]: slurmctld: error: Configured MailProg is invalid
Oct 14 06:39:17 DUT7152ATSM slurmctld[6988]: slurmctld: slurmctld version 23.02.6 started on cluster cool
Oct 14 06:39:17 DUT7152ATSM slurmctld[6988]: slurmctld: fatal: Can not recover assoc_usage state, incompatible version, got 8704 need >= 9472 <= 9984, start with '-i' to ignore this. Warning>
Oct 14 06:39:17 DUT7152ATSM systemd[1]: slurmctld.service: Main process exited, code=exited, status=1/FAILURE
Oct 14 06:39:17 DUT7152ATSM systemd[1]: slurmctld.service: Failed with result 'exit-code'.
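
To make sense of the numbers in the fatal line (got 8704, need >= 9472 <= 9984), here is a rough sketch that decodes them. It assumes Slurm packs its protocol version as a release index in the high byte and a minor value in the low byte, and the index-to-release table is written from memory rather than taken from this log, so treat both as assumptions and check them against the Slurm source. The helper names `decode` and `describe` are made up for this example.

```python
# Decode the protocol version numbers printed in the slurmctld fatal message.
# Assumption: the value is packed as (release_index << 8) | minor.

# Assumed mapping from release index to Slurm release (from memory, unverified here).
RELEASES = {
    34: "19.05",
    35: "20.02",
    36: "20.11",
    37: "21.08",
    38: "22.05",
    39: "23.02",
}


def decode(version: int) -> tuple[int, int]:
    """Split a packed version into (release_index, minor)."""
    return version >> 8, version & 0xFF


def describe(version: int) -> str:
    """Return a human-readable line for one packed version number."""
    index, minor = decode(version)
    release = RELEASES.get(index, "unknown")
    return f"{version}: release index {index}, minor {minor}, release {release}"


if __name__ == "__main__":
    # The three numbers from the log line: found, minimum accepted, maximum accepted.
    for value in (8704, 9472, 9984):
        print(describe(value))
```

If that mapping holds, the assoc_usage state file that slurmctld found was written by a much older Slurm than the 23.02.6 binary now running, which would explain why it refuses to load it. That would point at the directory set as StateSaveLocation in slurm.conf still holding state from an earlier installation. The log itself offers '-i' to ignore the old state, but it also attaches a warning that is cut off in the output above, so it seems worth reading the full message before using it.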