尘世壹俗人 2024-06-20 00:15 Acceptance rate: 85.2%
Closed

Hadoop 3.x: ApplicationMaster dies right after starting when running a test job

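I just set up a Hadoop 3.3.6 test cluster and ran into something odd. Formatting succeeded, the NameNode, DataNode, and YARN web pages all open normally, and the bundled commands (hadoop and so on) worked fine when I tested them.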

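However, when I submitted a MapReduce job to check that job submission works, the job reached YARN normally, but once it left the queue and its ApplicationMaster was dispatched to a DataNode host and started, the ApplicationMaster died immediately. The full log is below: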

[root@hdp4 hadoop]# hadoop jar /opt/hadoop-3.3.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar wordcount /input /output
2024-06-20 00:02:49 INFO  JobResourceUploader:907 - Disabling Erasure Coding for path: /hisdata/staging/root/.staging/job_1718812910805_0001
  2024-06-20 00:02:50 INFO  FileInputFormat:300 - Total input files to process : 1
  2024-06-20 00:02:50 INFO  JobSubmitter:202 - number of splits:1
  2024-06-20 00:02:50 INFO  JobSubmitter:298 - Submitting tokens for job: job_1718812910805_0001
  2024-06-20 00:02:50 INFO  JobSubmitter:299 - Executing with tokens: []
  2024-06-20 00:02:50 INFO  Configuration:2854 - resource-types.xml not found
  2024-06-20 00:02:50 INFO  ResourceUtils:476 - Unable to find 'resource-types.xml'.
  2024-06-20 00:02:51 INFO  YarnClientImpl:338 - Submitted application application_1718812910805_0001
  2024-06-20 00:02:51 INFO  Job:1682 - The url to track the job: http://hdp5:8088/proxy/application_1718812910805_0001/
  2024-06-20 00:02:51 INFO  Job:1727 - Running job: job_1718812910805_0001
  2024-06-20 00:03:17 INFO  Job:1748 - Job job_1718812910805_0001 running in uber mode : false
  2024-06-20 00:03:17 INFO  Job:1755 -  map 0% reduce 0%
  2024-06-20 00:03:17 INFO  Job:1768 - Job job_1718812910805_0001 failed with state FAILED due to: Application application_1718812910805_0001 failed 2 times due to AM Container for appattempt_1718812910805_0001_000002 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2024-06-20 00:03:17.049]Exception from container-launch.
Container id: container_e19_1718812910805_0001_02_000001
Exit code: 1

[2024-06-20 00:03:17.078]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.


[2024-06-20 00:03:17.081]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.


For more detailed output, check the application tracking page: http://hdp5:8088/cluster/app/application_1718812910805_0001 Then click on links to logs of each attempt.
. Failing the application.
  2024-06-20 00:03:17 INFO  Job:1773 - Counters: 0

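I checked the NameNode status and it is normal, and it is not in safe mode. I also checked the configuration files and found no discrepancies (otherwise formatting would not have succeeded). Network connectivity is fine too: the nodes can ping each other. Finally I wondered whether resources were insufficient, but I raised the cluster resources to 96 GB / 45 cores and even disabled the checks, and nothing changed: the ApplicationMaster still dies right as the MapReduce job is about to start running. I can't figure out what is going on. Any suggestions on where to look? The Hadoop version is 3.3.6.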
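The diagnostics above show only log4j warnings from the AM container's stderr, so the actual exception is not visible in them. A minimal sketch of pulling the full container logs follows. It relies on the `yarn logs -applicationId` command and on the job/application ID pairing visible in the log above; the helper names (`application_id_from`, `yarn_logs_command`, `fetch_logs`) are hypothetical and not part of Hadoop.

```python
import re
import subprocess


def application_id_from(job_or_app_id):
    """Map job_<timestamp>_<seq> to application_<timestamp>_<seq>.

    The two IDs share the same suffix, as seen in the log above
    (job_1718812910805_0001 / application_1718812910805_0001).
    """
    match = re.fullmatch(r"(?:job|application)_(\d+)_(\d+)", job_or_app_id)
    if match is None:
        raise ValueError(f"unrecognised ID: {job_or_app_id!r}")
    return f"application_{match.group(1)}_{match.group(2)}"


def yarn_logs_command(job_or_app_id):
    """Build the argument list for the YARN log-fetching CLI."""
    return ["yarn", "logs", "-applicationId", application_id_from(job_or_app_id)]


def fetch_logs(job_or_app_id):
    """Run `yarn logs` and return its stdout (assumes `yarn` is on PATH)."""
    result = subprocess.run(
        yarn_logs_command(job_or_app_id),
        capture_output=True,
        text=True,
        check=False,  # a failed application can still have logs worth reading
    )
    return result.stdout
```

For example, `fetch_logs("job_1718812910805_0001")` would run `yarn logs -applicationId application_1718812910805_0001`. If log aggregation is not enabled on the cluster, the command may return nothing; in that case the container logs stay on the NodeManager host that ran the attempt.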

2 answers

  • 尘世壹俗人 2024-06-21 16:18

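    Solved. I rewrote the configuration files from scratch and the problem went away. The cause was probably the formatting of the configuration files, or some very inconspicuous character in them.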

    This answer was selected as the best answer by the asker.
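The accepted answer blames formatting or an inconspicuous character in the configuration files, without saying which file or which character. A minimal sketch of how such files could be scanned is below. The checks are assumptions about what counts as suspicious (non-ASCII or control characters, malformed XML, whitespace around a property name or value), and `find_suspect_chars` and `check_config` are hypothetical helper names.

```python
import sys
import xml.etree.ElementTree as ET
from pathlib import Path


def find_suspect_chars(text):
    """Return (line, column, code point) for each character outside printable ASCII.

    Tabs and carriage returns are ignored. Legitimate non-ASCII text
    (for example comments in Chinese) will be reported as well.
    """
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in "\t\r":
                continue
            if not 32 <= ord(ch) <= 126:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits


def check_config(xml_text):
    """Return a list of human-readable problems found in one *-site.xml text."""
    problems = [
        f"line {line}, col {col}: unexpected character {code}"
        for line, col, code in find_suspect_chars(xml_text)
    ]
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        problems.append(f"not well-formed XML: {exc}")
        return problems
    for prop in root.iter("property"):
        name = prop.findtext("name", default="")
        value = prop.findtext("value", default="")
        # Assumption: stray whitespace around a name or value is worth a look.
        if name != name.strip() or value != value.strip():
            problems.append(f"property {name.strip()!r} has leading/trailing whitespace")
    return problems


if __name__ == "__main__":
    # Usage: python check_conf.py <file.xml> [<file.xml> ...]
    for path in sys.argv[1:]:
        text = Path(path).read_text(encoding="utf-8")
        for problem in check_config(text):
            print(f"{path}: {problem}")
```

Running it over each configuration file on every node would at least narrow down where a zero-width space, byte-order mark, or full-width character was pasted in. An empty result does not prove the files are correct.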
Question events

  • Closed, June 21
  • Answer accepted, June 21
  • Question created, June 20