Asked by qq_16403141 on 2017.09.09 16:20

Beginner: error when running a WordCount program on Hadoop

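Environment: Ubuntu 14.04 + Hadoop 2.6.1.
I am using VirtualBox virtual machines, with one master node and three slave nodes installed.
Hadoop starts successfully without any problems.
I installed Eclipse on Ubuntu and wrote a word count program in Java. The source code is as follows: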

package wordcount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * @author
 * @version Created 2017-09-09 08:50:51. Class description
 */
public class Wordcount {
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // @Override makes the compiler verify this really overrides Mapper.map
        @Override
        protected void map(LongWritable key, Text value, Mapper<LongWritable, Text, Text, IntWritable>.Context context)
                throws IOException, InterruptedException {
            StringTokenizer line = new StringTokenizer(value.toString());
            while (line.hasMoreTokens()) {
                word.set(line.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        // @Override makes the compiler verify this really overrides Reducer.reduce
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values,
                Reducer<Text, IntWritable, Text, IntWritable>.Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable obj : values) {
                sum += obj.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(Wordcount.class);

        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path("hdfs://master:9000/user/hduser/demo/test.txt"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000/user/hduser/demo/wordcount"));
        //FileInputFormat.addInputPath(job, new Path(args[0]));
        //FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

}
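
For reference, the tokenizing and summing that the mapper and reducer above perform can be checked without a cluster. The following is only a minimal plain-Java sketch of that logic, not part of the job itself; the class name `LocalWordCount` and the sample string are made up for illustration.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical helper: mirrors TokenizerMapper + IntSumReducer in a single JVM.
public class LocalWordCount {

    // Splits the text on whitespace (as StringTokenizer does in the mapper)
    // and adds 1 per token (as the reducer does when summing the values).
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            String word = tokens.nextToken();
            Integer previous = counts.get(word);
            counts.put(word, previous == null ? 1 : previous + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample input chosen for illustration only.
        System.out.println(count("hello hadoop hello")); // prints {hadoop=1, hello=2}
    }
}
```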

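After starting Hadoop, I ran the program above directly in Eclipse. It ran successfully and produced the wordcount folder, which contains a _SUCCESS file and the result file with the counts.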

Next I wanted to package the program as a jar file and run that. First I changed these lines in the program above:

        FileInputFormat.addInputPath(job, new Path("hdfs://master:9000/user/hduser/demo/test.txt"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000/user/hduser/demo/wordcount"));

to the following:

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

That is, the two paths are now passed in as arguments from the terminal.
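
Since `args[0]` and `args[1]` are read without any check, a wrong number of arguments only shows up later as a confusing failure. A minimal sketch of a guard, assuming exactly two arguments are expected; the class `JobArgs` and its `parse` method are hypothetical names, not part of Hadoop:

```java
// Hypothetical helper: validates the command-line arguments before the job is configured.
public class JobArgs {

    // Returns {inputPath, outputPath}, or throws if exactly two arguments were not given.
    public static String[] parse(String[] args) {
        if (args == null || args.length != 2) {
            throw new IllegalArgumentException(
                    "Usage: hadoop jar wordcount.jar wordcount.Wordcount <input path> <output path>");
        }
        return new String[] { args[0], args[1] };
    }

    public static void main(String[] args) {
        try {
            String[] paths = parse(args);
            System.out.println("input  = " + paths[0]);
            System.out.println("output = " + paths[1]);
        } catch (IllegalArgumentException e) {
            // Print the usage line and stop instead of failing later inside the job.
            System.err.println(e.getMessage());
            System.exit(2);
        }
    }
}
```

One situation such a check would catch: if the exported jar's manifest names a Main-Class, `hadoop jar` runs that class and passes everything after the jar name as arguments, so `wordcount.Wordcount` would arrive as `args[0]` and three arguments would be seen instead of two.
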
I packaged it as a jar file with Eclipse's export function, then entered the following in the terminal:

hadoop jar wordcount.jar wordcount.Wordcount hdfs://master:9000/user/hduser/demo/test.txt hdfs://master:9000/user/hduser/demo/wordcount

The run then failed with the following error:

17/09/09 11:18:53 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.56.100:8050
17/09/09 11:18:54 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/09/09 11:18:55 INFO input.FileInputFormat: Total input paths to process : 1
17/09/09 11:18:55 INFO mapreduce.JobSubmitter: number of splits:1
17/09/09 11:18:55 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1504926710828_0001
17/09/09 11:18:56 INFO impl.YarnClientImpl: Submitted application application_1504926710828_0001
17/09/09 11:18:56 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1504926710828_0001/
17/09/09 11:18:56 INFO mapreduce.Job: Running job: job_1504926710828_0001
17/09/09 11:19:14 INFO mapreduce.Job: Job job_1504926710828_0001 running in uber mode : false
17/09/09 11:19:14 INFO mapreduce.Job:  map 0% reduce 0%
17/09/09 11:19:14 INFO mapreduce.Job: Job job_1504926710828_0001 failed with state FAILED due to: Application application_1504926710828_0001 failed 2 times due to AM Container for appattempt_1504926710828_0001_000002 exited with  exitCode: 1
For more detailed output, check application tracking page:http://master:8088/proxy/application_1504926710828_0001/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1504926710828_0001_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
    at java.lang.Thread.run(Thread.java:748)


Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
17/09/09 11:19:14 INFO mapreduce.Job: Counters: 0
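
The stack trace above comes from the NodeManager's container launcher and only says that the ApplicationMaster container exited with code 1; the actual reason is normally in that container's own stderr/syslog. Those can be reached through the tracking page mentioned in the output, or with `yarn logs -applicationId <id>` if log aggregation is enabled on the cluster (an assumption; otherwise the logs stay in the NodeManager's local log directory). As a small sketch, the application ID can be pulled out of the client output like this; the class name `AppIdExtractor` is hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds the YARN application ID in the client's console output.
public class AppIdExtractor {

    // Application IDs in the output above have the form application_<digits>_<digits>.
    private static final Pattern APP_ID = Pattern.compile("application_\\d+_\\d+");

    // Returns the first application ID found, or null if there is none.
    public static String firstApplicationId(String clientOutput) {
        Matcher m = APP_ID.matcher(clientOutput);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Line copied from the console output above.
        String line = "17/09/09 11:18:56 INFO impl.YarnClientImpl: "
                + "Submitted application application_1504926710828_0001";
        // Prints: yarn logs -applicationId application_1504926710828_0001
        System.out.println("yarn logs -applicationId " + firstApplicationId(line));
    }
}
```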

I went and checked the log file:

2017-09-09 11:18:55,869 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /tmp/hadoop-yarn/staging/hduser/.staging/job_1504926710828_0001/job.xml is closed by DFSClient_NONMAPREDUCE_-1306163227_1
2017-09-09 11:18:59,502 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2017-09-09 11:18:59,503 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 1 millisecond(s).
2017-09-09 11:19:12,241 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.56.102:53610 Call#7 Retry#0: java.io.FileNotFoundException: File does not exist: /tmp/hadoop-yarn/staging/hduser/.staging/job_1504926710828_0001/job_1504926710828_0001_1.jhist
2017-09-09 11:19:12,293 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getBlockLocations from 192.168.56.102:53610 Call#8 Retry#0: java.io.FileNotFoundException: File does not exist: /tmp/hadoop-yarn/staging/hduser/.staging/job_1504926710828_0001/job_1504926710828_0001_1.jhist
2017-09-09 11:19:29,502 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30000 milliseconds
2017-09-09 11:19:29,502 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2017-09-09 11:19:42,634 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 192.168.56.100
2017-09-09 11:19:42,634 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2017-09-09 11:19:42,634 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 29
2017-09-09 11:19:42,635 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 40 Total time for transactions(ms): 6 Number of transactions batched in Syncs: 0 Number of syncs: 27 SyncTimes(ms): 545 
2017-09-09 11:19:42,704 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 40 Total time for transactions(ms): 6 Number of transactions batched in Syncs: 0 Number of syncs: 28 SyncTimes(ms): 613 
2017-09-09 11:19:42,704 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /usr/local/hadoop/hadoop_data/hdfs/namenode/current/edits_inprogress_0000000000000000029 -> /usr/local/hadoop/hadoop_data/hdfs/namenode/current/edits_0000000000000000029-0000000000000000068
2017-09-09 11:19:42,704 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 69
2017-09-09 11:19:59,503 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds
2017-09-09 11:19:59,503 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2017-09-09 11:20:29,504 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning after 30001 milliseconds
2017-09-09 11:20:29,504 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Scanned 0 directive(s) and 0 block(s) in 0 millisecond(s).
2017-09-09 11:20:42,759 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 192.168.56.100
2017-09-09 11:20:42,759 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Rolling edit logs
2017-09-09 11:20:42,759 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 69
2017-09-09 11:20:42,759 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 2 SyncTimes(ms): 24 
2017-09-09 11:20:42,791 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 2 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 3 SyncTimes(ms): 56 

An error is reported in there:

java.io.FileNotFoundException: File does not exist: /tmp/hadoop-yarn/staging/hduser/.staging/job_1504926710828_0001/job_1504926710828_0001_1.jhist

I'm a beginner and don't know what to do next.

1 answer

caozhy 2017.09.09 23:52
Accepted