Environment: Ubuntu 14.04 + Hadoop 2.6.1.
The cluster runs in VirtualBox VMs, with one master and three slave nodes.
Hadoop starts up successfully, with no problems at all.
I installed Eclipse on Ubuntu and wrote a word count program in Java; the source is as follows:
package wordcount;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
/**
 * @author
 * @version Created: 2017-09-09 08:50:51. Class description.
 */
public class Wordcount {

    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Mapper<LongWritable, Text, Text, IntWritable>.Context context)
                throws IOException, InterruptedException {
            StringTokenizer line = new StringTokenizer(value.toString());
            while (line.hasMoreTokens()) {
                word.set(line.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values,
                Reducer<Text, IntWritable, Text, IntWritable>.Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable obj : values) {
                sum += obj.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(Wordcount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("hdfs://master:9000/user/hduser/demo/test.txt"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000/user/hduser/demo/wordcount"));
        // FileInputFormat.addInputPath(job, new Path(args[0]));
        // FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
After starting Hadoop, I ran the program above directly from Eclipse. It ran successfully and produced the wordcount folder, which contains a _SUCCESS file as well as the result files with the word counts.
Then I wanted to package the program into a jar file and run it that way, so first I changed these lines of the program:
FileInputFormat.addInputPath(job, new Path("hdfs://master:9000/user/hduser/demo/test.txt"));
FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000/user/hduser/demo/wordcount"));
to the following:
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
that is, the two paths are passed in as arguments from the terminal.
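For reference, here is a sketch of what main looks like after the change, with an argument-count check added in the style of the WordCount example that ships with Hadoop. The GenericOptionsParser part is my own addition (it needs the extra import org.apache.hadoop.util.GenericOptionsParser), and I have not tested this variant:

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Strip generic Hadoop options (-D, -files, ...) so that only the
        // two path arguments remain, as the bundled WordCount example does.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length < 2) {
            System.err.println("Usage: wordcount <input path> <output path>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(Wordcount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }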
I packaged it into a jar file with Eclipse's Export function, then entered this in the terminal:
hadoop jar wordcount.jar wordcount.Wordcount hdfs://master:9000/user/hduser/demo/test.txt hdfs://master:9000/user/hduser/demo/wordcount
It failed with the following error:
17/09/09 11:18:53 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.56.100:8050
17/09/09 11:18:54 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/09/09 11:18:55 INFO input.FileInputFormat: Total input paths to process : 1
17/09/09 11:18:55 INFO mapreduce.JobSubmitter: number of splits:1
17/09/09 11:18:55 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1504926710828_0001
17/09/09 11:18:56 INFO impl.YarnClientImpl: Submitted application application_1504926710828_0001
17/09/09 11:18:56 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1504926710828_0001/
17/09/09 11:18:56 INFO mapreduce.Job: Running job: job_1504926710828_0001
17/09/09 11:19:14 INFO mapreduce.Job: Job job_1504926710828_0001 running in uber mode : false
17/09/09 11:19:14 INFO mapreduce.Job: map 0% reduce 0%
17/09/09 11:19:14 INFO mapreduce.Job: Job job_1504926710828_0001 failed with state FAILED due to: Application application_1504926710828_0001 failed 2 times due to AM Container for appattempt_1504926710828_0001_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://master:8088/proxy/application_1504926710828_0001/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1504926710828_0001_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
17/09/09 11:19:14 INFO mapreduce.Job: Counters: 0
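One thing I noticed in the output above is the WARN line telling me to implement the Tool interface and run the application with ToolRunner. I don't know whether it is related to the failure, but for completeness, this is a rough, untested sketch of how I understand the driver would be restructured (it additionally needs org.apache.hadoop.conf.Configured, org.apache.hadoop.util.Tool, and org.apache.hadoop.util.ToolRunner imported; everything else is as in my class above):

    public class Wordcount extends Configured implements Tool {

        @Override
        public int run(String[] args) throws Exception {
            // getConf() returns the Configuration that ToolRunner populated,
            // including any generic options (-D, -files, ...) from the command line.
            Job job = Job.getInstance(getConf(), "word count");
            job.setJarByClass(Wordcount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static void main(String[] args) throws Exception {
            // ToolRunner parses the generic options and passes the remaining
            // arguments (here, the input and output paths) to run().
            System.exit(ToolRunner.run(new Configuration(), new Wordcount(), args));
        }
    }

It would still be invoked the same way: hadoop jar wordcount.jar wordcount.Wordcount <input> <output>.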
I checked the log files and found the following:
2017-09-09 11:18:55,869 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/hduser/.staging/job_1504926710828_0001/job.xml is closed by DFSClient_NONMAPREDUCE_-1306163227_1
2017-09-09 11:18:59,502 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2017-09-09 11:18:59,503 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).
2017-09-09 11:19:12,241 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.56.102:53610 Call#7 Retry#0: java.io.FileNotFoundException: File does not exist: /tmp/hadoop-yarn/staging/hduser/.staging/job_1504926710828_0001/job_1504926710828_0001_1.jhist
2017-09-09 11:19:12,293 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.56.102:53610 Call#8 Retry#0: java.io.FileNotFoundException: File does not exist: /tmp/hadoop-yarn/staging/hduser/.staging/job_1504926710828_0001/job_1504926710828_0001_1.jhist
2017-09-09 11:19:29,502 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2017-09-09 11:19:29,502 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2017-09-09 11:19:42,634 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 192.168.56.100
2017-09-09 11:19:42,634 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2017-09-09 11:19:42,634 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 29
2017-09-09 11:19:42,635 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 40 Total time for transactions(ms): 6 Number of transactions batched in Syncs: 0 Number of syncs: 27 SyncTimes(ms): 545
2017-09-09 11:19:42,704 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 40 Total time for transactions(ms): 6 Number of transactions batched in Syncs: 0 Number of syncs: 28 SyncTimes(ms): 613
2017-09-09 11:19:42,704 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /usr/local/hadoop/hadoop_data/hdfs/namenode/current/edits_inprogress_0000000000000000029 -> /usr/local/hadoop/hadoop_data/hdfs/namenode/current/edits_0000000000000000029-0000000000000000068
2017-09-09 11:19:42,704 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 69
2017-09-09 11:19:59,503 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds
2017-09-09 11:19:59,503 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2017-09-09 11:20:29,504 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds
2017-09-09 11:20:29,504 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2017-09-09 11:20:42,759 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 192.168.56.100
2017-09-09 11:20:42,759 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2017-09-09 11:20:42,759 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 69
2017-09-09 11:20:42,759 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 24
2017-09-09 11:20:42,791 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 56
The error reported in there is:
java.io.FileNotFoundException: File does not exist: /tmp/hadoop-yarn/staging/hduser/.staging/job_1504926710828_0001/job_1504926710828_0001_1.jhist
I'm new to this and don't know what to do next.