当jar在hdfs的时候提交spark job报错 40C

(一)jar不在hdfs上的时候提交spark任务成功,使用的命令:
spark-submit --master spark://192.168.244.130:7077 --class cn.com.cnpc.klmy.common.WordCount2 --executor-memory 1G --total-executor-cores 2 /root/modelcall-2.0.jar
(二)而当jar在hdfs上的时候提交spark任务报错:classNotFoundException呢?,命令如下:
spark-submit --master spark://192.168.244.130:7077 --class cn.com.cnpc.klmy.common.WordCount2 --executor-memory 1G --total-executor-cores 2 hdfs://192.168.244.130:9000/mdjar/modelcall-2.0.jar

请教各位大咖这到底是什么原因造成的?望各位大咖不吝赐教!跪谢!!!

注:hdfs能够正常访问,代码里面产生的结果存在hdfs上(第一情况正常运行,在hdfs上能够查看到结果)

2个回答

跟你后来新问的一个一样,我在那边答了

Csdn user default icon
上传中...
上传图片
插入图片
抄袭、复制答案,以达到刷声望分或其他目的的行为,在CSDN问答是严格禁止的,一经发现立刻封号。是时候展现真正的技术了!
其他相关推荐
spark2.0报错求大神帮忙!!!谢谢!!

代码如下(网上摘录代码): package com.gree.test; import java.util.Arrays; import java.util.Iterator; import java.util.List; import org.apache.spark.SparkConf; import org.apache.spark.api.java.function.FlatMapFunction; import org.apache.spark.api.java.function.Function2; import org.apache.spark.api.java.function.PairFunction; import org.apache.spark.streaming.Durations; import org.apache.spark.streaming.api.java.JavaDStream; import org.apache.spark.streaming.api.java.JavaPairDStream; import org.apache.spark.streaming.api.java.JavaReceiverInputDStream; import org.apache.spark.streaming.api.java.JavaStreamingContext; import com.google.common.base.Optional; import scala.Tuple2; public class OnlineWordCount { public static void main(String[] args) { SparkConf conf = new SparkConf().setAppName("wordcount").setMaster("local[2]"); JavaStreamingContext jssc = new JavaStreamingContext(conf,Durations.seconds(5)); jssc.checkpoint("hdfs://spark001:9000/wordcount_checkpoint"); JavaReceiverInputDStream<String> lines = jssc.socketTextStream("spark001", 9999); JavaDStream<String> words = lines.flatMap(new FlatMapFunction<String, String>(){ private static final long serialVersionUID = 1L; @Override public Iterator<String> call(String line) throws Exception { return Arrays.asList(line.split(" ")).iterator(); } }); JavaPairDStream<String, Integer> pairs = words.mapToPair(new PairFunction<String, String, Integer>(){ private static final long serialVersionUID = 1L; @Override public Tuple2<String, Integer> call(String word) throws Exception { return new Tuple2<String, Integer>(word, 1); } }); JavaPairDStream<String, Integer> wordcounts = pairs.updateStateByKey( new Function2<List<Integer>, Optional<Integer>, Optional<Integer>>(){ private static final long serialVersionUID = 1L; @Override public Optional<Integer> call(List<Integer> values, Optional<Integer> state) throws Exception { Integer newValue = 0; if(state.isPresent()){ newValue = state.get(); } for(Integer value : values){ newValue += value; } return Optional.of(newValue); } }); wordcounts.print(); jssc.start(); try { jssc.awaitTermination(); } catch (InterruptedException e) { e.printStackTrace(); } jssc.close(); } } 报错位置为updateStateByKey位置: The method updateStateByKey(Function2<List<Integer>,Optional<S>,Optional<S>>) in the type JavaPairDStream<String,Integer> is not applicable for the arguments (new Function2<List<Integer>,Optional<Integer>,Optional<Integer>>(){}) 跪求大神解决。。。谢谢

spark任务spark-submit集群运行报错。

报错如下:SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/home/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/home/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] . ____ _ __ _ _ /\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \ ( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \ \\/ ___)| |_)| | | | | || (_| | ) ) ) ) ' |____| .__|_| |_|_| |_\__, | / / / / =========|_|==============|___/=/_/_/_/ :: Spring Boot :: 19/03/15 11:28:57 INFO demo.DemoApplication: Starting DemoApplication on spark01 with PID 15249 (/home/demo.jar started by root in /usr/sbin) 19/03/15 11:28:57 INFO demo.DemoApplication: No active profile set, falling back to default profiles: default 19/03/15 11:28:57 INFO context.AnnotationConfigServletWebServerApplicationContext: Refreshing org.springframework.boot.web.servlet.context.AnnotationConfigServletWebServerApplicationContext@3b79fd76: startup date [Fri Mar 15 11:28:57 CST 2019]; root of context hierarchy 19/03/15 11:28:59 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring 19/03/15 11:28:59 WARN context.AnnotationConfigServletWebServerApplicationContext: Exception encountered during context initialization - cancelling refresh attempt: org.springframework.context.ApplicationContextException: Unable to start web server; nested exception is org.springframework.context.ApplicationContextException: Unable to start ServletWebServerApplicationContext due to missing ServletWebServerFactory bean. 19/03/15 11:28:59 ERROR boot.SpringApplication: Application run failed org.springframework.context.ApplicationContextException: Unable to start web server; nested exception is org.springframework.context.ApplicationContextException: Unable to start ServletWebServerApplicationContext due to missing ServletWebServerFactory bean. at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.onRefresh(ServletWebServerApplicationContext.java:155) at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:543) at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:140) at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:752) at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:388) at org.springframework.boot.SpringApplication.run(SpringApplication.java:327) at org.springframework.boot.SpringApplication.run(SpringApplication.java:1246) at org.springframework.boot.SpringApplication.run(SpringApplication.java:1234) at com.spark.demo.DemoApplication.main(DemoApplication.java:12) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: org.springframework.context.ApplicationContextException: Unable to start ServletWebServerApplicationContext due to missing ServletWebServerFactory bean. at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.getWebServerFactory(ServletWebServerApplicationContext.java:204) at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.createWebServer(ServletWebServerApplicationContext.java:178) at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.onRefresh(ServletWebServerApplicationContext.java:152) ... 17 more

spark submit 提交集群任务后,spark Web UI界面不显示,但是有4040界面,显示local模式

遇到如下问题,求教大神: 集群有三个节点,111为master。剩余两个为slave。每个节点 4核,6.6G。 提交命令如下 nohup bin/spark-submit --master spark://sousou:7077 --executor-memory 1g --total-executor-cores 2 --class AnalyzeInfo /spark/jar/v2_AnalyzeInfo.jar & nohup bin/spark-submit --master spark://sousou111:7077 --executor-memory 1g --total-executor-cores 2 --class SaveInfoMain /spark/jar/saveAnn.jar & 问题如下: 1. spark submit 提交集群任务后,spark Web UI界面不显示SaveInfoMain,但是有4040界面,且查看界面Environment显示local模式。这是为什么啊?这样造成的问题是程序没有办法在界面停止。且这个程序有时候会造成处理数据异常缓慢,偶尔处理三四个小时之前的数据,AnalyzeInfo这个任务就不会产生这个问题。 2. 而且这两个任务出现的共同点是:我设置的触发HDFS上的目录下文件就优雅停止程序,刚运行时还可以,但是这两个程序运行时间长了,比如说一天后我上传到HDFS上文件,这两程序就不能成功停止了。 Environment图片如下: ![图片说明](https://img-ask.csdn.net/upload/201810/23/1540264892_86550.png) ![图片说明](https://img-ask.csdn.net/upload/201810/23/1540264909_714074.png)

Spark scala 运行报错 .test$$anonfun$1

同样的写法再scala中执行报错,然而在java中能够正常执行 以下是报错内容 ``` helloOffSLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/C:/Users/zsts/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.5/log4j-slf4j-impl-2.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/C:/Users/zsts/.m2/repository/org/slf4j/slf4j-log4j12/1.7.10/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. 19:25:11.875 [main] ERROR org.apache.hadoop.util.Shell - Failed to locate the winutils binary in the hadoop binary path java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries. at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355) ~[hadoop-common-2.6.4.jar:?] at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370) [hadoop-common-2.6.4.jar:?] at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363) [hadoop-common-2.6.4.jar:?] at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79) [hadoop-common-2.6.4.jar:?] at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:116) [hadoop-common-2.6.4.jar:?] at org.apache.hadoop.security.Groups.<init>(Groups.java:93) [hadoop-common-2.6.4.jar:?] at org.apache.hadoop.security.Groups.<init>(Groups.java:73) [hadoop-common-2.6.4.jar:?] at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:293) [hadoop-common-2.6.4.jar:?] at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283) [hadoop-common-2.6.4.jar:?] at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260) [hadoop-common-2.6.4.jar:?] at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:789) [hadoop-common-2.6.4.jar:?] at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774) [hadoop-common-2.6.4.jar:?] at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647) [hadoop-common-2.6.4.jar:?] at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2198) [spark-core_2.10-1.6.2.jar:1.6.2] at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2198) [spark-core_2.10-1.6.2.jar:1.6.2] at scala.Option.getOrElse(Option.scala:120) [?:?] at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2198) [spark-core_2.10-1.6.2.jar:1.6.2] at org.apache.spark.SparkContext.<init>(SparkContext.scala:322) [spark-core_2.10-1.6.2.jar:1.6.2] at tarot.test$.main(test.scala:19) [bin/:?] at tarot.test.main(test.scala) [bin/:?] Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://tarot1:9000/sparkTest/hello at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285) at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228) at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313) at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:199) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.rdd.RDD.partitions(RDD.scala:237) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.rdd.RDD.partitions(RDD.scala:237) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.rdd.RDD.partitions(RDD.scala:237) at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1307) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111) at org.apache.spark.rdd.RDD.withScope(RDD.scala:316) at org.apache.spark.rdd.RDD.take(RDD.scala:1302) at tarot.test$.main(test.scala:26) at tarot.test.main(test.scala) ``` scala代码 ``` package tarot import scala.tools.nsc.doc.model.Val import org.apache.spark.SparkConf import org.apache.spark.SparkContext import org.apache.spark.api.java.JavaSparkContext object test { def main(hello:Array[String]){ print("helloOff") val conf = new SparkConf() conf.setMaster("spark://tarot1:7077") .setAppName("hello_off") .set("spark.executor.memory", "4g") .set("spark.executor.cores", "4") .set("spark.cores.max", "4") .set("spark.sql.crossJoin.enabled", "true") val sc = new SparkContext(conf) sc.setLogLevel("ERROR") val file = sc.textFile("hdfs://tarot1:9000/sparkTest/hello") val filterRDD = file.filter { (ss:String) => ss.contains("hello") } val f=filterRDD.cache() // println(f) // filterRDD.count() for(x <- f.take(100)){ println(x) } } def helloingTest(jsc:SparkContext){ val sc = jsc val file = sc.textFile("hdfs://tarot1:9000/sparkTest/hello") val filterRDD = file.filter((ss:String) => ss.contains("hello")) val f=filterRDD.cache() println(f) val i = filterRDD.count() println(i) } // val seehello = def helloingTest(jsc:JavaSparkContext){ val sc = jsc val file = sc.textFile("hdfs://tarot1:9000/sparkTest/hello") val filterRDD = file.filter((ss:String) => ss.contains("hello")) val f=filterRDD.cache() println(f) val i = filterRDD.count() println(i) } } ``` java代码 ``` package com.tarot.sparkToHdfsTest; import org.apache.spark.SparkConf; import org.apache.spark.SparkContext; import org.apache.spark.api.java.JavaRDD; import org.apache.spark.api.java.JavaSparkContext; import org.apache.spark.api.java.function.Function; import org.apache.spark.rdd.RDD; //import scalaModule.hello; public class App { public static void main( String[] args ) { SparkConf sparkConf = new SparkConf(); sparkConf.setMaster("spark://tarot1:7077") .setAppName("hello off") .set("spark.executor.memory","4g") .set("spark.executor.cores", "4") .set("spark.cores.max","4") .set("spark.sql.crossJoin.enabled", "true"); JavaSparkContext jsc = new JavaSparkContext(sparkConf); jsc.setLogLevel("ERROR"); text(jsc); // test.helloingTest(jsc); } /** * test * @param jsc */ private static void text(JavaSparkContext jsc){ // jsc.textFile("hdfs://tarot1:9000/sparkTest/hello"); JavaRDD<String> jr= jsc.textFile("hdfs://tarot1:9000/sparkTest/hello",1); jr.cache(); // test t = new test(); jr.filter(f); for (String string : jr.take(100)) { System.out.println(string); } System.out.println("hello off"); } public static Function<String, Boolean> f = new Function<String, Boolean>() { public Boolean call(String s) { return s.contains("hello"); } }; } ``` 坑了很久了,网上的解决办法不是让我去shell上就是让我上传jar包,但是java不用啊?都调用的JVM。 大神救命

spark 读取不到hive metastore 获取不到数据库

直接上异常 ``` Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/data01/hadoop/yarn/local/filecache/355/spark2-hdp-yarn-archive.tar.gz/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/hdp/2.6.5.0-292/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 19/08/13 19:53:17 INFO SignalUtils: Registered signal handler for TERM 19/08/13 19:53:17 INFO SignalUtils: Registered signal handler for HUP 19/08/13 19:53:17 INFO SignalUtils: Registered signal handler for INT 19/08/13 19:53:17 INFO SecurityManager: Changing view acls to: yarn,hdfs 19/08/13 19:53:17 INFO SecurityManager: Changing modify acls to: yarn,hdfs 19/08/13 19:53:17 INFO SecurityManager: Changing view acls groups to: 19/08/13 19:53:17 INFO SecurityManager: Changing modify acls groups to: 19/08/13 19:53:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set() 19/08/13 19:53:18 INFO ApplicationMaster: Preparing Local resources 19/08/13 19:53:19 INFO ApplicationMaster: ApplicationAttemptId: appattempt_1565610088533_0087_000001 19/08/13 19:53:19 INFO ApplicationMaster: Starting the user application in a separate Thread 19/08/13 19:53:19 INFO ApplicationMaster: Waiting for spark context initialization... 19/08/13 19:53:19 INFO SparkContext: Running Spark version 2.3.0.2.6.5.0-292 19/08/13 19:53:19 INFO SparkContext: Submitted application: voice_stream 19/08/13 19:53:19 INFO SecurityManager: Changing view acls to: yarn,hdfs 19/08/13 19:53:19 INFO SecurityManager: Changing modify acls to: yarn,hdfs 19/08/13 19:53:19 INFO SecurityManager: Changing view acls groups to: 19/08/13 19:53:19 INFO SecurityManager: Changing modify acls groups to: 19/08/13 19:53:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set() 19/08/13 19:53:19 INFO Utils: Successfully started service 'sparkDriver' on port 20410. 19/08/13 19:53:19 INFO SparkEnv: Registering MapOutputTracker 19/08/13 19:53:19 INFO SparkEnv: Registering BlockManagerMaster 19/08/13 19:53:19 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 19/08/13 19:53:19 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 19/08/13 19:53:19 INFO DiskBlockManager: Created local directory at /data01/hadoop/yarn/local/usercache/hdfs/appcache/application_1565610088533_0087/blockmgr-94d35b97-43b2-496e-a4cb-73ecd3ed186c 19/08/13 19:53:19 INFO MemoryStore: MemoryStore started with capacity 366.3 MB 19/08/13 19:53:19 INFO SparkEnv: Registering OutputCommitCoordinator 19/08/13 19:53:19 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter 19/08/13 19:53:19 INFO Utils: Successfully started service 'SparkUI' on port 28852. 19/08/13 19:53:19 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://datanode02:28852 19/08/13 19:53:19 INFO YarnClusterScheduler: Created YarnClusterScheduler 19/08/13 19:53:20 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1565610088533_0087 and attemptId Some(appattempt_1565610088533_0087_000001) 19/08/13 19:53:20 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 31984. 19/08/13 19:53:20 INFO NettyBlockTransferService: Server created on datanode02:31984 19/08/13 19:53:20 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 19/08/13 19:53:20 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, datanode02, 31984, None) 19/08/13 19:53:20 INFO BlockManagerMasterEndpoint: Registering block manager datanode02:31984 with 366.3 MB RAM, BlockManagerId(driver, datanode02, 31984, None) 19/08/13 19:53:20 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, datanode02, 31984, None) 19/08/13 19:53:20 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, datanode02, 31984, None) 19/08/13 19:53:20 INFO EventLoggingListener: Logging events to hdfs:/spark2-history/application_1565610088533_0087_1 19/08/13 19:53:20 INFO ApplicationMaster: =============================================================================== YARN executor launch context: env: CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>/usr/hdp/2.6.5.0-292/hadoop/conf<CPS>/usr/hdp/2.6.5.0-292/hadoop/*<CPS>/usr/hdp/2.6.5.0-292/hadoop/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>/usr/hdp/current/ext/hadoop/*<CPS>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/2.6.5.0-292/hadoop/lib/hadoop-lzo-0.6.0.2.6.5.0-292.jar:/etc/hadoop/conf/secure:/usr/hdp/current/ext/hadoop/*<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__ SPARK_YARN_STAGING_DIR -> *********(redacted) SPARK_USER -> *********(redacted) command: LD_LIBRARY_PATH="/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:$LD_LIBRARY_PATH" \ {{JAVA_HOME}}/bin/java \ -server \ -Xmx5120m \ -Djava.io.tmpdir={{PWD}}/tmp \ '-Dspark.history.ui.port=18081' \ '-Dspark.rpc.message.maxSize=100' \ -Dspark.yarn.app.container.log.dir=<LOG_DIR> \ -XX:OnOutOfMemoryError='kill %p' \ org.apache.spark.executor.CoarseGrainedExecutorBackend \ --driver-url \ spark://CoarseGrainedScheduler@datanode02:20410 \ --executor-id \ <executorId> \ --hostname \ <hostname> \ --cores \ 2 \ --app-id \ application_1565610088533_0087 \ --user-class-path \ file:$PWD/__app__.jar \ --user-class-path \ file:$PWD/hadoop-common-2.7.3.jar \ --user-class-path \ file:$PWD/guava-12.0.1.jar \ --user-class-path \ file:$PWD/hbase-server-1.2.8.jar \ --user-class-path \ file:$PWD/hbase-protocol-1.2.8.jar \ --user-class-path \ file:$PWD/hbase-client-1.2.8.jar \ --user-class-path \ file:$PWD/hbase-common-1.2.8.jar \ --user-class-path \ file:$PWD/mysql-connector-java-5.1.44-bin.jar \ --user-class-path \ file:$PWD/spark-streaming-kafka-0-8-assembly_2.11-2.3.2.jar \ --user-class-path \ file:$PWD/spark-examples_2.11-1.6.0-typesafe-001.jar \ --user-class-path \ file:$PWD/fastjson-1.2.7.jar \ 1><LOG_DIR>/stdout \ 2><LOG_DIR>/stderr resources: spark-streaming-kafka-0-8-assembly_2.11-2.3.2.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/spark-streaming-kafka-0-8-assembly_2.11-2.3.2.jar" } size: 12271027 timestamp: 1565697198603 type: FILE visibility: PRIVATE spark-examples_2.11-1.6.0-typesafe-001.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/spark-examples_2.11-1.6.0-typesafe-001.jar" } size: 1867746 timestamp: 1565697198751 type: FILE visibility: PRIVATE hbase-server-1.2.8.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/hbase-server-1.2.8.jar" } size: 4197896 timestamp: 1565697197770 type: FILE visibility: PRIVATE hbase-common-1.2.8.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/hbase-common-1.2.8.jar" } size: 570163 timestamp: 1565697198318 type: FILE visibility: PRIVATE __app__.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/spark_history_data2.jar" } size: 44924 timestamp: 1565697197260 type: FILE visibility: PRIVATE guava-12.0.1.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/guava-12.0.1.jar" } size: 1795932 timestamp: 1565697197614 type: FILE visibility: PRIVATE hbase-client-1.2.8.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/hbase-client-1.2.8.jar" } size: 1306401 timestamp: 1565697198180 type: FILE visibility: PRIVATE __spark_conf__ -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/__spark_conf__.zip" } size: 273513 timestamp: 1565697199131 type: ARCHIVE visibility: PRIVATE fastjson-1.2.7.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/fastjson-1.2.7.jar" } size: 417221 timestamp: 1565697198865 type: FILE visibility: PRIVATE hbase-protocol-1.2.8.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/hbase-protocol-1.2.8.jar" } size: 4366252 timestamp: 1565697198023 type: FILE visibility: PRIVATE __spark_libs__ -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/hdp/apps/2.6.5.0-292/spark2/spark2-hdp-yarn-archive.tar.gz" } size: 227600110 timestamp: 1549953820247 type: ARCHIVE visibility: PUBLIC mysql-connector-java-5.1.44-bin.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/mysql-connector-java-5.1.44-bin.jar" } size: 999635 timestamp: 1565697198445 type: FILE visibility: PRIVATE hadoop-common-2.7.3.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/hadoop-common-2.7.3.jar" } size: 3479293 timestamp: 1565697197476 type: FILE visibility: PRIVATE =============================================================================== 19/08/13 19:53:20 INFO RMProxy: Connecting to ResourceManager at namenode02/10.1.38.38:8030 19/08/13 19:53:20 INFO YarnRMClient: Registering the ApplicationMaster 19/08/13 19:53:20 INFO YarnAllocator: Will request 3 executor container(s), each with 2 core(s) and 5632 MB memory (including 512 MB of overhead) 19/08/13 19:53:20 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@datanode02:20410) 19/08/13 19:53:20 INFO YarnAllocator: Submitted 3 unlocalized container requests. 19/08/13 19:53:20 INFO ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals 19/08/13 19:53:20 INFO AMRMClientImpl: Received new token for : datanode03:45454 19/08/13 19:53:21 INFO YarnAllocator: Launching container container_e20_1565610088533_0087_01_000002 on host datanode03 for executor with ID 1 19/08/13 19:53:21 INFO YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them. 19/08/13 19:53:21 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0 19/08/13 19:53:21 INFO ContainerManagementProtocolProxy: Opening proxy : datanode03:45454 19/08/13 19:53:21 INFO AMRMClientImpl: Received new token for : datanode01:45454 19/08/13 19:53:21 INFO YarnAllocator: Launching container container_e20_1565610088533_0087_01_000003 on host datanode01 for executor with ID 2 19/08/13 19:53:21 INFO YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them. 19/08/13 19:53:21 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0 19/08/13 19:53:21 INFO ContainerManagementProtocolProxy: Opening proxy : datanode01:45454 19/08/13 19:53:22 INFO AMRMClientImpl: Received new token for : datanode02:45454 19/08/13 19:53:22 INFO YarnAllocator: Launching container container_e20_1565610088533_0087_01_000004 on host datanode02 for executor with ID 3 19/08/13 19:53:22 INFO YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them. 19/08/13 19:53:22 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0 19/08/13 19:53:22 INFO ContainerManagementProtocolProxy: Opening proxy : datanode02:45454 19/08/13 19:53:24 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.1.198.144:41122) with ID 1 19/08/13 19:53:25 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.1.229.163:24656) with ID 3 19/08/13 19:53:25 INFO BlockManagerMasterEndpoint: Registering block manager datanode03:3328 with 2.5 GB RAM, BlockManagerId(1, datanode03, 3328, None) 19/08/13 19:53:25 INFO BlockManagerMasterEndpoint: Registering block manager datanode02:28863 with 2.5 GB RAM, BlockManagerId(3, datanode02, 28863, None) 19/08/13 19:53:25 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.1.229.158:64276) with ID 2 19/08/13 19:53:25 INFO YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8 19/08/13 19:53:25 INFO YarnClusterScheduler: YarnClusterScheduler.postStartHook done 19/08/13 19:53:25 INFO BlockManagerMasterEndpoint: Registering block manager datanode01:20487 with 2.5 GB RAM, BlockManagerId(2, datanode01, 20487, None) 19/08/13 19:53:25 WARN SparkContext: Using an existing SparkContext; some configuration may not take effect. 19/08/13 19:53:25 INFO SparkContext: Starting job: start at VoiceApplication2.java:128 19/08/13 19:53:25 INFO DAGScheduler: Registering RDD 1 (start at VoiceApplication2.java:128) 19/08/13 19:53:25 INFO DAGScheduler: Got job 0 (start at VoiceApplication2.java:128) with 20 output partitions 19/08/13 19:53:25 INFO DAGScheduler: Final stage: ResultStage 1 (start at VoiceApplication2.java:128) 19/08/13 19:53:25 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0) 19/08/13 19:53:25 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0) 19/08/13 19:53:26 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[1] at start at VoiceApplication2.java:128), which has no missing parents 19/08/13 19:53:26 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.1 KB, free 366.3 MB) 19/08/13 19:53:26 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2011.0 B, free 366.3 MB) 19/08/13 19:53:26 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on datanode02:31984 (size: 2011.0 B, free: 366.3 MB) 19/08/13 19:53:26 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1039 19/08/13 19:53:26 INFO DAGScheduler: Submitting 50 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[1] at start at VoiceApplication2.java:128) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 19/08/13 19:53:26 INFO YarnClusterScheduler: Adding task set 0.0 with 50 tasks 19/08/13 19:53:26 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, datanode02, executor 3, partition 0, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, datanode03, executor 1, partition 1, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, datanode01, executor 2, partition 2, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, datanode02, executor 3, partition 3, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, datanode03, executor 1, partition 4, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, datanode01, executor 2, partition 5, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on datanode02:28863 (size: 2011.0 B, free: 2.5 GB) 19/08/13 19:53:26 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on datanode03:3328 (size: 2011.0 B, free: 2.5 GB) 19/08/13 19:53:26 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on datanode01:20487 (size: 2011.0 B, free: 2.5 GB) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, datanode02, executor 3, partition 6, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, datanode02, executor 3, partition 7, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 693 ms on datanode02 (executor 3) (1/50) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 712 ms on datanode02 (executor 3) (2/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, datanode02, executor 3, partition 8, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 7) in 21 ms on datanode02 (executor 3) (3/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, datanode02, executor 3, partition 9, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in 26 ms on datanode02 (executor 3) (4/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 10, datanode02, executor 3, partition 10, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in 23 ms on datanode02 (executor 3) (5/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 11, datanode02, executor 3, partition 11, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 9) in 25 ms on datanode02 (executor 3) (6/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 12, datanode02, executor 3, partition 12, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 10.0 in stage 0.0 (TID 10) in 18 ms on datanode02 (executor 3) (7/50) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 11.0 in stage 0.0 (TID 11) in 14 ms on datanode02 (executor 3) (8/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 13.0 in stage 0.0 (TID 13, datanode02, executor 3, partition 13, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 14.0 in stage 0.0 (TID 14, datanode02, executor 3, partition 14, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 12.0 in stage 0.0 (TID 12) in 16 ms on datanode02 (executor 3) (9/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 15.0 in stage 0.0 (TID 15, datanode02, executor 3, partition 15, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 13.0 in stage 0.0 (TID 13) in 22 ms on datanode02 (executor 3) (10/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 16.0 in stage 0.0 (TID 16, datanode02, executor 3, partition 16, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 14.0 in stage 0.0 (TID 14) in 16 ms on datanode02 (executor 3) (11/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 17.0 in stage 0.0 (TID 17, datanode02, executor 3, partition 17, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 15.0 in stage 0.0 (TID 15) in 13 ms on datanode02 (executor 3) (12/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 18.0 in stage 0.0 (TID 18, datanode01, executor 2, partition 18, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 19.0 in stage 0.0 (TID 19, datanode01, executor 2, partition 19, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 787 ms on datanode01 (executor 2) (13/50) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 789 ms on datanode01 (executor 2) (14/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 20.0 in stage 0.0 (TID 20, datanode03, executor 1, partition 20, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 21.0 in stage 0.0 (TID 21, datanode03, executor 1, partition 21, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 905 ms on datanode03 (executor 1) (15/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 907 ms on datanode03 (executor 1) (16/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 22.0 in stage 0.0 (TID 22, datanode02, executor 3, partition 22, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 23.0 in stage 0.0 (TID 23, datanode02, executor 3, partition 23, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 24.0 in stage 0.0 (TID 24, datanode01, executor 2, partition 24, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 18.0 in stage 0.0 (TID 18) in 124 ms on datanode01 (executor 2) (17/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 16.0 in stage 0.0 (TID 16) in 134 ms on datanode02 (executor 3) (18/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 25.0 in stage 0.0 (TID 25, datanode01, executor 2, partition 25, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 26.0 in stage 0.0 (TID 26, datanode03, executor 1, partition 26, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 17.0 in stage 0.0 (TID 17) in 134 ms on datanode02 (executor 3) (19/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 20.0 in stage 0.0 (TID 20) in 122 ms on datanode03 (executor 1) (20/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 27.0 in stage 0.0 (TID 27, datanode03, executor 1, partition 27, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 19.0 in stage 0.0 (TID 19) in 127 ms on datanode01 (executor 2) (21/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 21.0 in stage 0.0 (TID 21) in 123 ms on datanode03 (executor 1) (22/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 28.0 in stage 0.0 (TID 28, datanode02, executor 3, partition 28, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 29.0 in stage 0.0 (TID 29, datanode02, executor 3, partition 29, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 22.0 in stage 0.0 (TID 22) in 19 ms on datanode02 (executor 3) (23/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 23.0 in stage 0.0 (TID 23) in 18 ms on datanode02 (executor 3) (24/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 30.0 in stage 0.0 (TID 30, datanode01, executor 2, partition 30, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 31.0 in stage 0.0 (TID 31, datanode01, executor 2, partition 31, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 25.0 in stage 0.0 (TID 25) in 27 ms on datanode01 (executor 2) (25/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 24.0 in stage 0.0 (TID 24) in 29 ms on datanode01 (executor 2) (26/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 32.0 in stage 0.0 (TID 32, datanode02, executor 3, partition 32, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 29.0 in stage 0.0 (TID 29) in 16 ms on datanode02 (executor 3) (27/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 33.0 in stage 0.0 (TID 33, datanode03, executor 1, partition 33, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 26.0 in stage 0.0 (TID 26) in 30 ms on datanode03 (executor 1) (28/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 34.0 in stage 0.0 (TID 34, datanode02, executor 3, partition 34, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 28.0 in stage 0.0 (TID 28) in 21 ms on datanode02 (executor 3) (29/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 35.0 in stage 0.0 (TID 35, datanode03, executor 1, partition 35, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 27.0 in stage 0.0 (TID 27) in 32 ms on datanode03 (executor 1) (30/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 36.0 in stage 0.0 (TID 36, datanode02, executor 3, partition 36, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 32.0 in stage 0.0 (TID 32) in 11 ms on datanode02 (executor 3) (31/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 37.0 in stage 0.0 (TID 37, datanode01, executor 2, partition 37, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 30.0 in stage 0.0 (TID 30) in 18 ms on datanode01 (executor 2) (32/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 38.0 in stage 0.0 (TID 38, datanode01, executor 2, partition 38, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 31.0 in stage 0.0 (TID 31) in 20 ms on datanode01 (executor 2) (33/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 39.0 in stage 0.0 (TID 39, datanode03, executor 1, partition 39, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 33.0 in stage 0.0 (TID 33) in 17 ms on datanode03 (executor 1) (34/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 34.0 in stage 0.0 (TID 34) in 17 ms on datanode02 (executor 3) (35/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 40.0 in stage 0.0 (TID 40, datanode02, executor 3, partition 40, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 41.0 in stage 0.0 (TID 41, datanode03, executor 1, partition 41, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 35.0 in stage 0.0 (TID 35) in 17 ms on datanode03 (executor 1) (36/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 42.0 in stage 0.0 (TID 42, datanode02, executor 3, partition 42, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 36.0 in stage 0.0 (TID 36) in 16 ms on datanode02 (executor 3) (37/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 43.0 in stage 0.0 (TID 43, datanode01, executor 2, partition 43, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 37.0 in stage 0.0 (TID 37) in 16 ms on datanode01 (executor 2) (38/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 44.0 in stage 0.0 (TID 44, datanode02, executor 3, partition 44, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 45.0 in stage 0.0 (TID 45, datanode02, executor 3, partition 45, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 40.0 in stage 0.0 (TID 40) in 14 ms on datanode02 (executor 3) (39/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 42.0 in stage 0.0 (TID 42) in 11 ms on datanode02 (executor 3) (40/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 46.0 in stage 0.0 (TID 46, datanode03, executor 1, partition 46, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 39.0 in stage 0.0 (TID 39) in 20 ms on datanode03 (executor 1) (41/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 47.0 in stage 0.0 (TID 47, datanode03, executor 1, partition 47, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 41.0 in stage 0.0 (TID 41) in 20 ms on datanode03 (executor 1) (42/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 48.0 in stage 0.0 (TID 48, datanode01, executor 2, partition 48, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 49.0 in stage 0.0 (TID 49, datanode01, executor 2, partition 49, PROCESS_LOCAL, 7888 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 43.0 in stage 0.0 (TID 43) in 18 ms on datanode01 (executor 2) (43/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 38.0 in stage 0.0 (TID 38) in 31 ms on datanode01 (executor 2) (44/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 45.0 in stage 0.0 (TID 45) in 11 ms on datanode02 (executor 3) (45/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 44.0 in stage 0.0 (TID 44) in 16 ms on datanode02 (executor 3) (46/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 46.0 in stage 0.0 (TID 46) in 18 ms on datanode03 (executor 1) (47/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 48.0 in stage 0.0 (TID 48) in 15 ms on datanode01 (executor 2) (48/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 47.0 in stage 0.0 (TID 47) in 15 ms on datanode03 (executor 1) (49/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 49.0 in stage 0.0 (TID 49) in 25 ms on datanode01 (executor 2) (50/50) 19/08/13 19:53:27 INFO YarnClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 19/08/13 19:53:27 INFO DAGScheduler: ShuffleMapStage 0 (start at VoiceApplication2.java:128) finished in 1.174 s 19/08/13 19:53:27 INFO DAGScheduler: looking for newly runnable stages 19/08/13 19:53:27 INFO DAGScheduler: running: Set() 19/08/13 19:53:27 INFO DAGScheduler: waiting: Set(ResultStage 1) 19/08/13 19:53:27 INFO DAGScheduler: failed: Set() 19/08/13 19:53:27 INFO DAGScheduler: Submitting ResultStage 1 (ShuffledRDD[2] at start at VoiceApplication2.java:128), which has no missing parents 19/08/13 19:53:27 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.2 KB, free 366.3 MB) 19/08/13 19:53:27 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1979.0 B, free 366.3 MB) 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on datanode02:31984 (size: 1979.0 B, free: 366.3 MB) 19/08/13 19:53:27 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1039 19/08/13 19:53:27 INFO DAGScheduler: Submitting 20 missing tasks from ResultStage 1 (ShuffledRDD[2] at start at VoiceApplication2.java:128) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 19/08/13 19:53:27 INFO YarnClusterScheduler: Adding task set 1.0 with 20 tasks 19/08/13 19:53:27 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 50, datanode03, executor 1, partition 0, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 51, datanode02, executor 3, partition 1, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 52, datanode01, executor 2, partition 3, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 53, datanode03, executor 1, partition 2, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 4.0 in stage 1.0 (TID 54, datanode02, executor 3, partition 4, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 5.0 in stage 1.0 (TID 55, datanode01, executor 2, partition 5, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on datanode02:28863 (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on datanode01:20487 (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on datanode03:3328 (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:53:27 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.1.229.163:24656 19/08/13 19:53:27 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.1.198.144:41122 19/08/13 19:53:27 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.1.229.158:64276 19/08/13 19:53:27 INFO TaskSetManager: Starting task 7.0 in stage 1.0 (TID 56, datanode03, executor 1, partition 7, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 53) in 192 ms on datanode03 (executor 1) (1/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 8.0 in stage 1.0 (TID 57, datanode03, executor 1, partition 8, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 7.0 in stage 1.0 (TID 56) in 25 ms on datanode03 (executor 1) (2/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 6.0 in stage 1.0 (TID 58, datanode02, executor 3, partition 6, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 51) in 220 ms on datanode02 (executor 3) (3/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 14.0 in stage 1.0 (TID 59, datanode03, executor 1, partition 14, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 8.0 in stage 1.0 (TID 57) in 17 ms on datanode03 (executor 1) (4/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 16.0 in stage 1.0 (TID 60, datanode03, executor 1, partition 16, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 14.0 in stage 1.0 (TID 59) in 15 ms on datanode03 (executor 1) (5/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 16.0 in stage 1.0 (TID 60) in 21 ms on datanode03 (executor 1) (6/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 9.0 in stage 1.0 (TID 61, datanode02, executor 3, partition 9, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 4.0 in stage 1.0 (TID 54) in 269 ms on datanode02 (executor 3) (7/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 50) in 339 ms on datanode03 (executor 1) (8/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 10.0 in stage 1.0 (TID 62, datanode02, executor 3, partition 10, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 6.0 in stage 1.0 (TID 58) in 56 ms on datanode02 (executor 3) (9/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 11.0 in stage 1.0 (TID 63, datanode01, executor 2, partition 11, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 5.0 in stage 1.0 (TID 55) in 284 ms on datanode01 (executor 2) (10/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 12.0 in stage 1.0 (TID 64, datanode01, executor 2, partition 12, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 52) in 287 ms on datanode01 (executor 2) (11/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 13.0 in stage 1.0 (TID 65, datanode02, executor 3, partition 13, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 15.0 in stage 1.0 (TID 66, datanode02, executor 3, partition 15, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 10.0 in stage 1.0 (TID 62) in 25 ms on datanode02 (executor 3) (12/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 9.0 in stage 1.0 (TID 61) in 29 ms on datanode02 (executor 3) (13/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 17.0 in stage 1.0 (TID 67, datanode02, executor 3, partition 17, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 15.0 in stage 1.0 (TID 66) in 13 ms on datanode02 (executor 3) (14/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 13.0 in stage 1.0 (TID 65) in 16 ms on datanode02 (executor 3) (15/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 18.0 in stage 1.0 (TID 68, datanode02, executor 3, partition 18, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 19.0 in stage 1.0 (TID 69, datanode01, executor 2, partition 19, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 11.0 in stage 1.0 (TID 63) in 30 ms on datanode01 (executor 2) (16/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 12.0 in stage 1.0 (TID 64) in 30 ms on datanode01 (executor 2) (17/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 17.0 in stage 1.0 (TID 67) in 17 ms on datanode02 (executor 3) (18/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 19.0 in stage 1.0 (TID 69) in 13 ms on datanode01 (executor 2) (19/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 18.0 in stage 1.0 (TID 68) in 20 ms on datanode02 (executor 3) (20/20) 19/08/13 19:53:27 INFO YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool 19/08/13 19:53:27 INFO DAGScheduler: ResultStage 1 (start at VoiceApplication2.java:128) finished in 0.406 s 19/08/13 19:53:27 INFO DAGScheduler: Job 0 finished: start at VoiceApplication2.java:128, took 1.850883 s 19/08/13 19:53:27 INFO ReceiverTracker: Starting 1 receivers 19/08/13 19:53:27 INFO ReceiverTracker: ReceiverTracker started 19/08/13 19:53:27 INFO KafkaInputDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO KafkaInputDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO KafkaInputDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Initialized and validated org.apache.spark.streaming.kafka.KafkaInputDStream@5fd3dc81 19/08/13 19:53:27 INFO ForEachDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO ForEachDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO ForEachDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@4044ec97 19/08/13 19:53:27 INFO KafkaInputDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO KafkaInputDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO KafkaInputDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Initialized and validated org.apache.spark.streaming.kafka.KafkaInputDStream@5fd3dc81 19/08/13 19:53:27 INFO MappedDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO MappedDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO MappedDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO MappedDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO MappedDStream: Initialized and validated org.apache.spark.streaming.dstream.MappedDStream@5dd4b960 19/08/13 19:53:27 INFO ForEachDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO ForEachDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO ForEachDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@132d0c3c 19/08/13 19:53:27 INFO KafkaInputDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO KafkaInputDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO KafkaInputDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Initialized and validated org.apache.spark.streaming.kafka.KafkaInputDStream@5fd3dc81 19/08/13 19:53:27 INFO MappedDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO MappedDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO MappedDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO MappedDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO MappedDStream: Initialized and validated org.apache.spark.streaming.dstream.MappedDStream@5dd4b960 19/08/13 19:53:27 INFO ForEachDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO ForEachDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO ForEachDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@525bed0c 19/08/13 19:53:27 INFO DAGScheduler: Got job 1 (start at VoiceApplication2.java:128) with 1 output partitions 19/08/13 19:53:27 INFO DAGScheduler: Final stage: ResultStage 2 (start at VoiceApplication2.java:128) 19/08/13 19:53:27 INFO DAGScheduler: Parents of final stage: List() 19/08/13 19:53:27 INFO DAGScheduler: Missing parents: List() 19/08/13 19:53:27 INFO DAGScheduler: Submitting ResultStage 2 (Receiver 0 ParallelCollectionRDD[3] at makeRDD at ReceiverTracker.scala:613), which has no missing parents 19/08/13 19:53:27 INFO ReceiverTracker: Receiver 0 started 19/08/13 19:53:27 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 133.5 KB, free 366.2 MB) 19/08/13 19:53:27 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 36.3 KB, free 366.1 MB) 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on datanode02:31984 (size: 36.3 KB, free: 366.3 MB) 19/08/13 19:53:27 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1039 19/08/13 19:53:27 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (Receiver 0 ParallelCollectionRDD[3] at makeRDD at ReceiverTracker.scala:613) (first 15 tasks are for partitions Vector(0)) 19/08/13 19:53:27 INFO YarnClusterScheduler: Adding task set 2.0 with 1 tasks 19/08/13 19:53:27 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 70, datanode01, executor 2, partition 0, PROCESS_LOCAL, 8757 bytes) 19/08/13 19:53:27 INFO RecurringTimer: Started timer for JobGenerator at time 1565697240000 19/08/13 19:53:27 INFO JobGenerator: Started JobGenerator at 1565697240000 ms 19/08/13 19:53:27 INFO JobScheduler: Started JobScheduler 19/08/13 19:53:27 INFO StreamingContext: StreamingContext started 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on datanode01:20487 (size: 36.3 KB, free: 2.5 GB) 19/08/13 19:53:27 INFO ReceiverTracker: Registered receiver for stream 0 from 10.1.229.158:64276 19/08/13 19:54:00 INFO JobScheduler: Added jobs for time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Starting job streaming job 1565697240000 ms.0 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Starting job streaming job 1565697240000 ms.1 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Finished job streaming job 1565697240000 ms.1 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Finished job streaming job 1565697240000 ms.0 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Starting job streaming job 1565697240000 ms.2 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO SharedState: loading hive config file: file:/data01/hadoop/yarn/local/usercache/hdfs/filecache/85431/__spark_conf__.zip/__hadoop_conf__/hive-site.xml 19/08/13 19:54:00 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('hdfs://CID-042fb939-95b4-4b74-91b8-9f94b999bdf7/apps/hive/warehouse'). 19/08/13 19:54:00 INFO SharedState: Warehouse path is 'hdfs://CID-042fb939-95b4-4b74-91b8-9f94b999bdf7/apps/hive/warehouse'. 19/08/13 19:54:00 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint 19/08/13 19:54:00 INFO BlockManagerInfo: Removed broadcast_1_piece0 on datanode02:31984 in memory (size: 1979.0 B, free: 366.3 MB) 19/08/13 19:54:00 INFO BlockManagerInfo: Removed broadcast_1_piece0 on datanode02:28863 in memory (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:54:00 INFO BlockManagerInfo: Removed broadcast_1_piece0 on datanode01:20487 in memory (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:54:00 INFO BlockManagerInfo: Removed broadcast_1_piece0 on datanode03:3328 in memory (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:54:02 INFO CodeGenerator: Code generated in 175.416957 ms 19/08/13 19:54:02 INFO JobScheduler: Finished job streaming job 1565697240000 ms.2 from job set of time 1565697240000 ms 19/08/13 19:54:02 ERROR JobScheduler: Error running job streaming job 1565697240000 ms.2 org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'meta_voice' not found; at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:40) at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:331) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:388) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:398) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:393) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:122) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:115) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at scala.util.Try$.apply(Try.scala:192) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 19/08/13 19:54:02 ERROR ApplicationMaster: User class threw exception: org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'meta_voice' not found; org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'meta_voice' not found; at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:40) at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:331) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:388) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:398) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:393) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:122) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:115) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at scala.util.Try$.apply(Try.scala:192) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 19/08/13 19:54:02 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'meta_voice' not found; at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:40) at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:331) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:388) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:398) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:393) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:122) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:115) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at scala.util.Try$.apply(Try.scala:192) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) ) 19/08/13 19:54:02 INFO StreamingContext: Invoking stop(stopGracefully=true) from shutdown hook 19/08/13 19:54:02 INFO ReceiverTracker: Sent stop signal to all 1 receivers 19/08/13 19:54:02 ERROR ReceiverTracker: Deregistered receiver for stream 0: Stopped by driver 19/08/13 19:54:02 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 70) in 35055 ms on datanode01 (executor 2) (1/1) 19/08/13 19:54:02 INFO YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool 19/08/13 19:54:02 INFO DAGScheduler: ResultStage 2 (start at VoiceApplication2.java:128) finished in 35.086 s 19/08/13 19:54:02 INFO ReceiverTracker: Waiting for receiver job to terminate gracefully 19/08/13 19:54:02 INFO ReceiverTracker: Waited for receiver job to terminate gracefully 19/08/13 19:54:02 INFO ReceiverTracker: All of the receivers have deregistered successfully 19/08/13 19:54:02 INFO ReceiverTracker: ReceiverTracker stopped 19/08/13 19:54:02 INFO JobGenerator: Stopping JobGenerator gracefully 19/08/13 19:54:02 INFO JobGenerator: Waiting for all received blocks to be consumed for job generation 19/08/13 19:54:02 INFO JobGenerator: Waited for all received blocks to be consumed for job generation 19/08/13 19:54:12 WARN ShutdownHookManager: ShutdownHook '$anon$2' timeout, java.util.concurrent.TimeoutException java.util.concurrent.TimeoutException at java.util.concurrent.FutureTask.get(FutureTask.java:205) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67) 19/08/13 19:54:12 ERROR Utils: Uncaught exception in thread pool-1-thread-1 java.lang.InterruptedException at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1252) at java.lang.Thread.join(Thread.java:1326) at org.apache.spark.streaming.util.RecurringTimer.stop(RecurringTimer.scala:86) at org.apache.spark.streaming.scheduler.JobGenerator.stop(JobGenerator.scala:137) at org.apache.spark.streaming.scheduler.JobScheduler.stop(JobScheduler.scala:123) at org.apache.spark.streaming.StreamingContext$$anonfun$stop$1.apply$mcV$sp(StreamingContext.scala:681) at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1357) at org.apache.spark.streaming.StreamingContext.stop(StreamingContext.scala:680) at org.apache.spark.streaming.StreamingContext.org$apache$spark$streaming$StreamingContext$$stopOnShutdown(StreamingContext.scala:714) at org.apache.spark.streaming.StreamingContext$$anonfun$start$1.apply$mcV$sp(StreamingContext.scala:599) at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188) at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1988) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188) at scala.util.Try$.apply(Try.scala:192) at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) ```

spark2.1.0运行spark on yarn的client模式一定需要自行编译吗?

如题,我用官网下载的预编译版本,每次运行client模式都会报错 ERROR SparkContext: Error initializing SparkContext. org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master. 但是用cluster模式就能正常运行。网上看了一些~请问有这个说法吗? 采用的是java1.8,hadoop2.7.3,spark2.1.0。 请指导一下,谢谢!

Spark提交作业为什么一定要conf.setJars(),它的具体作用到底是什么?

代码如下: ``` package wordcount import org.apache.spark.SparkContext import org.apache.spark.SparkConf import org.apache.spark.sql.SparkSession import org.apache.spark.rdd.RDD object WordCount extends App { val conf = new SparkConf() //就是这里,为什必须要有它,它的具体作用到底是啥? .set("spark.jars", "src/main/resources/sparkcore.jar,") .set("spark.app.name", "WordCount") .set("spark.master", "spark://master:7077") .set("spark.driver.host", "win") .set("spark.executor.memory", "512M") .set("spark.eventLog.enabled", "true") .set("spark.eventLog.dir", "hdfs://master:9000/spark/history") val sc=new SparkContext(conf) val lines:RDD[String]=sc.textFile("hdfs://master:9000/user/dsf/wordcount_input") val words:RDD[String]=lines.flatMap(_.split(" ")) val wordAndOne:RDD[(String,Int)]=words.map((_,1)) val reduce:RDD[(String,Int)]=wordAndOne.reduceByKey(_+_) val sorted:RDD[(String,Int)]=reduce.sortBy(_._2, ascending=false,numPartitions=1) sorted.saveAsTextFile("hdfs://master:9000/user/dsf/wordcount_output") println("\ntextFile: "+lines.collect().toBuffer) println("flatMap: "+words.collect().toBuffer) println("map: "+wordAndOne.collect().toBuffer) println("reduceByKey: "+reduce.collect().toBuffer) println("sortBy: "+sorted.collect().toBuffer) sc.stop() } /** 在Linux终端运行此应用的命令行: spark-submit \ --master spark://master:7077 \ --class wordcount.WordCount \ sparkcore.jar */ ``` 如果没有.set("spark.jars", "src/main/resources/sparkcore.jar,")这段代码,它会报这个异常: ``` Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, 192.168.1.15, executor 0): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD ``` ![这是我在Spark官网找到的:](https://img-ask.csdn.net/upload/201810/14/1539507421_95226.jpg) 翻译过来是: spark.jars: 以逗号分隔的本地jar列表,包含在驱动程序和执行程序类路径中。 按照官网的意思,是Driver和Excutor都应该有程序的jar包,可我不明白它的具体原理,哪位好心人给讲解一下,谢谢!

执行jar报错 Hadoop java.io.IOException

[img=http://img.bbs.csdn.net/upload/201703/15/1489518401_142809.png][/img] Error: java.io.IOException: Initialization of all the collectors failed. Error in last collector was :interface javax.xml.soap.Text hadoop jar Hadoop_Demo1.jar /user/myData/ /user/out/ 执行简单jar包 17/03/15 02:52:37 INFO client.RMProxy: Connecting to ResourceManager at s0/192.168.253.130:8032 17/03/15 02:52:37 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this. 17/03/15 02:52:38 INFO input.FileInputFormat: Total input paths to process : 2 17/03/15 02:52:38 INFO mapreduce.JobSubmitter: number of splits:2 17/03/15 02:52:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1489512856623_0004 17/03/15 02:52:39 INFO impl.YarnClientImpl: Submitted application application_1489512856623_0004 17/03/15 02:52:39 INFO mapreduce.Job: The url to track the job: http://s0:8088/proxy/application_1489512856623_0004/ 17/03/15 02:52:39 INFO mapreduce.Job: Running job: job_1489512856623_0004 17/03/15 02:52:50 INFO mapreduce.Job: Job job_1489512856623_0004 running in uber mode : false 17/03/15 02:52:50 INFO mapreduce.Job: map 0% reduce 0% 17/03/15 02:55:18 INFO mapreduce.Job: map 50% reduce 0% 17/03/15 02:55:18 INFO mapreduce.Job: Task Id : attempt_1489512856623_0004_m_000001_0, Status : FAILED Error: java.io.IOException: Initialization of all the collectors failed. Error in last collector was :interface javax.xml.soap.Text at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:414) at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:698) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.ClassCastException: interface javax.xml.soap.Text at java.lang.Class.asSubclass(Class.java:3404) at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:887) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:1004) at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:402) ... 9 more Container killed by the ApplicationMaster. 17/03/15 02:55:18 INFO mapreduce.Job: Task Id : attempt_1489512856623_0004_m_000000_0, Status : FAILED Error: java.io.IOException: Initialization of all the collectors failed. Error in last collector was :interface javax.xml.soap.Text at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:414) at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:698) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.ClassCastException: interface javax.xml.soap.Text at java.lang.Class.asSubclass(Class.java:3404) at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:887) at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:1004) at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:402) ... 9 more 17/03/15 02:55:19 INFO mapreduce.Job: map 0% reduce 0% 17/03/15 02:55:31 INFO mapreduce.Job: Task Id : attempt_1489512856623_0004_m_000000_1, Status : FAILED Error: java.io.IOException: Initialization of all the collectors failed. Error in last collector was :interface javax.xml.soap.Text at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:414) at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81) at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:698) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

hadoop使用yarn运行jar 报java.lang.ClassNotFoundException 找不到类 (找不到的不是主类)

1、写了一个数据分析的程序,用idea打成jar包,依赖jar都打进去了 ![图片说明](https://img-ask.csdn.net/upload/201911/03/1572779664_439750.png) 已经设置了 job.setJarByClass(CountDurationRunner.class); 2、开启hadoop zookeeper 和hbase集群 3、yarn运行jar : $ /opt/module/hadoop-2.7.2/bin/yarn jar ct_analysis.jar runner.CountDurationRunner 报错截图:![图片说明](https://img-ask.csdn.net/upload/201911/03/1572779908_781957.png) CountDurationRunner类代码: ``` package runner; import kv.key.ComDimension; //就是这里第一个就没找到 import kv.value.CountDurationValue; import mapper.CountDurationMapper; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hbase.HBaseConfiguration; import org.apache.hadoop.hbase.TableName; import org.apache.hadoop.hbase.client.Admin; import org.apache.hadoop.hbase.client.Connection; import org.apache.hadoop.hbase.client.ConnectionFactory; import org.apache.hadoop.hbase.client.Scan; import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil; import org.apache.hadoop.io.Text; import org.apache.hadoop.mapreduce.Job; import org.apache.hadoop.util.Tool; import org.apache.hadoop.util.ToolRunner; import outputformat.MysqlOutputFormat; import reducer.CountDurationReducer; import java.io.IOException; public class CountDurationRunner implements Tool { private Configuration conf = null; @Override public void setConf(Configuration conf) { this.conf = HBaseConfiguration.create(conf); } @Override public Configuration getConf() { return this.conf; } @Override public int run(String[] args) throws Exception { //得到conf Configuration conf = this.getConf(); //实例化job Job job = Job.getInstance(conf); job.setJarByClass(CountDurationRunner.class); //组装Mapper InputFormat initHbaseInputConfig(job); //组装Reducer outputFormat initHbaseOutputConfig(job); return job.waitForCompletion(true) ? 0 : 1; } private void initHbaseOutputConfig(Job job) { Connection connection = null; Admin admin = null; String tableName = "ns_ct:calllog"; try { connection = ConnectionFactory.createConnection(job.getConfiguration()); admin = connection.getAdmin(); if(!admin.tableExists(TableName.valueOf(tableName))) throw new RuntimeException("没有找到目标表"); Scan scan = new Scan(); //初始化Mapper TableMapReduceUtil.initTableMapperJob( tableName, scan, CountDurationMapper.class, ComDimension.class, Text.class, job, true); }catch (IOException e){ e.printStackTrace(); }finally { try { if(admin!=null) admin.close(); if(connection!=null) connection.close(); } catch (IOException e) { e.printStackTrace(); } } } private void initHbaseInputConfig(Job job) { job.setReducerClass(CountDurationReducer.class); job.setOutputKeyClass(ComDimension.class); job.setOutputValueClass(CountDurationValue.class); job.setOutputFormatClass(MysqlOutputFormat.class); } public static void main(String[] args) { try { int status = ToolRunner.run(new CountDurationRunner(), args); System.exit(status); } catch (Exception e) { e.printStackTrace(); } } } 这问题困扰很久了,有人说classPath不对,不知道如何修改,求助! ```

执行hql的时候,在hadoop中不显示结果,请问是啥原因?

我的hql语句是:select a.s_id,a.s_name,count(b.c_id),sum(b.s_score) from student a join score b on a.s_id = b.s_id group by a.s_id,a.s_name; 以下是程序执行过程中的代码: WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases. Query ID = hadoop00_20190707222645_088edcae-7934-42ab-be0d-162c6c3175ed Total jobs = 1 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/home/hadoop00/apps/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/home/hadoop00/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] 2019-07-07 22:27:08 Starting to launch local task to process map join; maximum memory = 518979584 2019-07-07 22:27:13 Dump the side-table for tag: 1 with group count: 7 into file: file:/tmp/hadoop00/13449221-f642-4dc7-a09d-61592e221ee3/hive_2019-07-07_22-26-45_439_2116768033594646754-1/-local-10005/HashTable-Stage-2/MapJoin-mapfile01--.hashtable 2019-07-07 22:27:13 Uploaded 1 File to: file:/tmp/hadoop00/13449221-f642-4dc7-a09d-61592e221ee3/hive_2019-07-07_22-26-45_439_2116768033594646754-1/-local-10005/HashTable-Stage-2/MapJoin-mapfile01--.hashtable (537 bytes) 2019-07-07 22:27:13 End of local task; Time Taken: 4.37 sec. Execution completed successfully MapredLocal task succeeded Launching Job 1 out of 1 Number of reduce tasks not specified. Estimated from input data size: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapreduce.job.reduces=<number> Starting Job = job_1562492270355_0001, Tracking URL = http://hadoop03:8088/proxy/application_1562492270355_0001/ Kill Command = /home/hadoop00/hadoop-2.7.7/bin/hadoop job -kill job_1562492270355_0001 Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1 2019-07-07 22:29:22,969 Stage-2 map = 0%, reduce = 0% 2019-07-07 22:30:23,579 Stage-2 map = 0%, reduce = 0% 2019-07-07 22:31:24,667 Stage-2 map = 0%, reduce = 0% 2019-07-07 22:32:21,811 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 26.68 sec 2019-07-07 22:33:17,853 Stage-2 map = 100%, reduce = 67%, Cumulative CPU 30.33 sec 2019-07-07 22:33:50,432 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 52.09 sec MapReduce Total cumulative CPU time: 52 seconds 90 msec Ended Job = job_1562492270355_0001 MapReduce Jobs Launched: Stage-Stage-2: Map: 1 Reduce: 1 Cumulative CPU: 52.09 sec HDFS Read: 12916 HDFS Write: 87 SUCCESS Total MapReduce CPU Time Spent: 52 seconds 90 msec OK Time taken: 430.739 seconds 两个表如下: student: s_id s_name s_brith s_sex 01 赵雷 1990-01-01 男 NULL NULL NULL 02 钱电 1990-12-21 男 NULL NULL NULL 03 孙风 1990-05-20 男 NULL NULL NULL 04 李云 1990-08-06 男 NULL NULL NULL 05 周梅 1991-12-01 女 NULL NULL NULL 06 吴兰 1992-03-01 女 NULL NULL NULL 07 郑竹 1989-07-01 女 NULL NULL NULL 08 王菊 1990-01-20 女 NULL NULL NULL score: s_id c_id s_score 01 01 80 01 02 90 01 03 99 02 01 70 02 02 60 02 03 80 03 01 80 03 02 80 03 03 80 04 01 50 04 02 30 04 03 20 05 01 76 05 02 87 06 01 31 06 03 34 07 02 89 07 03 98

大学四年自学走来,这些私藏的实用工具/学习网站我贡献出来了

大学四年,看课本是不可能一直看课本的了,对于学习,特别是自学,善于搜索网上的一些资源来辅助,还是非常有必要的,下面我就把这几年私藏的各种资源,网站贡献出来给你们。主要有:电子书搜索、实用工具、在线视频学习网站、非视频学习网站、软件下载、面试/求职必备网站。 注意:文中提到的所有资源,文末我都给你整理好了,你们只管拿去,如果觉得不错,转发、分享就是最大的支持了。 一、电子书搜索 对于大部分程序员...

在中国程序员是青春饭吗?

今年,我也32了 ,为了不给大家误导,咨询了猎头、圈内好友,以及年过35岁的几位老程序员……舍了老脸去揭人家伤疤……希望能给大家以帮助,记得帮我点赞哦。 目录: 你以为的人生 一次又一次的伤害 猎头界的真相 如何应对互联网行业的「中年危机」 一、你以为的人生 刚入行时,拿着傲人的工资,想着好好干,以为我们的人生是这样的: 等真到了那一天,你会发现,你的人生很可能是这样的: ...

Java基础知识面试题(2020最新版)

文章目录Java概述何为编程什么是Javajdk1.5之后的三大版本JVM、JRE和JDK的关系什么是跨平台性?原理是什么Java语言有哪些特点什么是字节码?采用字节码的最大好处是什么什么是Java程序的主类?应用程序和小程序的主类有何不同?Java应用程序与小程序之间有那些差别?Java和C++的区别Oracle JDK 和 OpenJDK 的对比基础语法数据类型Java有哪些数据类型switc...

我以为我学懂了数据结构,直到看了这个导图才发现,我错了

数据结构与算法思维导图

技术大佬:我去,你写的 switch 语句也太老土了吧

昨天早上通过远程的方式 review 了两名新来同事的代码,大部分代码都写得很漂亮,严谨的同时注释也很到位,这令我非常满意。但当我看到他们当中有一个人写的 switch 语句时,还是忍不住破口大骂:“我擦,小王,你丫写的 switch 语句也太老土了吧!” 来看看小王写的代码吧,看完不要骂我装逼啊。 private static String createPlayer(PlayerTypes p...

和黑客斗争的 6 天!

互联网公司工作,很难避免不和黑客们打交道,我呆过的两家互联网公司,几乎每月每天每分钟都有黑客在公司网站上扫描。有的是寻找 Sql 注入的缺口,有的是寻找线上服务器可能存在的漏洞,大部分都...

Linux 会成为主流桌面操作系统吗?

整理 |屠敏出品 | CSDN(ID:CSDNnews)2020 年 1 月 14 日,微软正式停止了 Windows 7 系统的扩展支持,这意味着服役十年的 Windows 7,属于...

讲一个程序员如何副业月赚三万的真实故事

loonggg读完需要3分钟速读仅需 1 分钟大家好,我是你们的校长。我之前讲过,这年头,只要肯动脑,肯行动,程序员凭借自己的技术,赚钱的方式还是有很多种的。仅仅靠在公司出卖自己的劳动时...

学习总结之HTML5剑指前端(建议收藏,图文并茂)

前言学习《HTML5与CSS3权威指南》这本书很不错,学完之后我颇有感触,觉得web的世界开明了许多。这本书是需要有一定基础的web前端开发工程师。这本书主要学习HTML5和css3,看...

女程序员,为什么比男程序员少???

昨天看到一档综艺节目,讨论了两个话题:(1)中国学生的数学成绩,平均下来看,会比国外好?为什么?(2)男生的数学成绩,平均下来看,会比女生好?为什么?同时,我又联想到了一个技术圈经常讨...

搜狗输入法也在挑战国人的智商!

故事总是一个接着一个到来...上周写完《鲁大师已经彻底沦为一款垃圾流氓软件!》这篇文章之后,鲁大师的市场工作人员就找到了我,希望把这篇文章删除掉。经过一番沟通我先把这篇文章从公号中删除了...

副业收入是我做程序媛的3倍,工作外的B面人生是怎样的?

提到“程序员”,多数人脑海里首先想到的大约是:为人木讷、薪水超高、工作枯燥…… 然而,当离开工作岗位,撕去层层标签,脱下“程序员”这身外套,有的人生动又有趣,马上展现出了完全不同的A/B面人生! 不论是简单的爱好,还是正经的副业,他们都干得同样出色。偶尔,还能和程序员的特质结合,产生奇妙的“化学反应”。 @Charlotte:平日素颜示人,周末美妆博主 大家都以为程序媛也个个不修边幅,但我们也许...

MySQL数据库面试题(2020最新版)

文章目录数据库基础知识为什么要使用数据库什么是SQL?什么是MySQL?数据库三大范式是什么mysql有关权限的表都有哪几个MySQL的binlog有有几种录入格式?分别有什么区别?数据类型mysql有哪些数据类型引擎MySQL存储引擎MyISAM与InnoDB区别MyISAM索引与InnoDB索引的区别?InnoDB引擎的4大特性存储引擎选择索引什么是索引?索引有哪些优缺点?索引使用场景(重点)...

新一代神器STM32CubeMonitor介绍、下载、安装和使用教程

关注、星标公众号,不错过精彩内容作者:黄工公众号:strongerHuang最近ST官网悄悄新上线了一款比较强大的工具:STM32CubeMonitor V1.0.0。经过我研究和使用之...

记一次腾讯面试,我挂在了最熟悉不过的队列上……

腾讯后台面试,面试官问:如何自己实现队列?

如果你是老板,你会不会踢了这样的员工?

有个好朋友ZS,是技术总监,昨天问我:“有一个老下属,跟了我很多年,做事勤勤恳恳,主动性也很好。但随着公司的发展,他的进步速度,跟不上团队的步伐了,有点...

我入职阿里后,才知道原来简历这么写

私下里,有不少读者问我:“二哥,如何才能写出一份专业的技术简历呢?我总感觉自己写的简历太烂了,所以投了无数份,都石沉大海了。”说实话,我自己好多年没有写过简历了,但我认识的一个同行,他在阿里,给我说了一些他当年写简历的方法论,我感觉太牛逼了,实在是忍不住,就分享了出来,希望能够帮助到你。 01、简历的本质 作为简历的撰写者,你必须要搞清楚一点,简历的本质是什么,它就是为了来销售你的价值主张的。往深...

冒泡排序动画(基于python pygame实现)

本项目效果初始截图如下 动画见本人b站投稿:https://www.bilibili.com/video/av95491382 本项目对应github地址:https://github.com/BigShuang python版本:3.6,pygame版本:1.9.3。(python版本一致应该就没什么问题) 样例gif如下 ======================= 大爽歌作,mad

Redis核心原理与应用实践

Redis核心原理与应用实践 在很多场景下都会使用Redis,但是到了深层次的时候就了解的不是那么深刻,以至于在面试的时候经常会遇到卡壳的现象,学习知识要做到系统和深入,不要把Redis想象的过于复杂,和Mysql一样,是个读取数据的软件。 有一个理解是Redis是key value缓存服务器,更多的优点在于对value的操作更加丰富。 安装 yum install redis #yum安装 b...

现代的 “Hello, World”,可不仅仅是几行代码而已

作者 |Charles R. Martin译者 | 弯月,责编 | 夕颜头图 |付费下载自视觉中国出品 | CSDN(ID:CSDNnews)新手...

带了6个月的徒弟当了面试官,而身为高级工程师的我天天修Bug......

即将毕业的应届毕业生一枚,现在只拿到了两家offer,但最近听到一些消息,其中一个offer,我这个组据说客户很少,很有可能整组被裁掉。 想问大家: 如果我刚入职这个组就被裁了怎么办呢? 大家都是什么时候知道自己要被裁了的? 面试软技能指导: BQ/Project/Resume 试听内容: 除了刷题,还有哪些技能是拿到offer不可或缺的要素 如何提升面试软实力:简历, 行为面试,沟通能...

!大部分程序员只会写3年代码

如果世界上都是这种不思进取的软件公司,那别说大部分程序员只会写 3 年代码,恐怕就没有程序员这种职业。

离职半年了,老东家又发 offer,回不回?

有小伙伴问松哥这个问题,他在上海某公司,在离职了几个月后,前公司的领导联系到他,希望他能够返聘回去,他很纠结要不要回去? 俗话说好马不吃回头草,但是这个小伙伴既然感到纠结了,我觉得至少说明了两个问题:1.曾经的公司还不错;2.现在的日子也不是很如意。否则应该就不会纠结了。 老实说,松哥之前也有过类似的经历,今天就来和小伙伴们聊聊回头草到底吃不吃。 首先一个基本观点,就是离职了也没必要和老东家弄的苦...

2020阿里全球数学大赛:3万名高手、4道题、2天2夜未交卷

阿里巴巴全球数学竞赛( Alibaba Global Mathematics Competition)由马云发起,由中国科学技术协会、阿里巴巴基金会、阿里巴巴达摩院共同举办。大赛不设报名门槛,全世界爱好数学的人都可参与,不论是否出身数学专业、是否投身数学研究。 2020年阿里巴巴达摩院邀请北京大学、剑桥大学、浙江大学等高校的顶尖数学教师组建了出题组。中科院院士、美国艺术与科学院院士、北京国际数学...

为什么你不想学习?只想玩?人是如何一步一步废掉的

不知道是不是只有我这样子,还是你们也有过类似的经历。 上学的时候总有很多光辉历史,学年名列前茅,或者单科目大佬,但是虽然慢慢地长大了,你开始懈怠了,开始废掉了。。。 什么?你说不知道具体的情况是怎么样的? 我来告诉你: 你常常潜意识里或者心理觉得,自己真正的生活或者奋斗还没有开始。总是幻想着自己还拥有大把时间,还有无限的可能,自己还能逆风翻盘,只不是自己还没开始罢了,自己以后肯定会变得特别厉害...

HTTP与HTTPS的区别

面试官问HTTP与HTTPS的区别,我这样回答让他竖起大拇指!

程序员毕业去大公司好还是小公司好?

虽然大公司并不是人人都能进,但我仍建议还未毕业的同学,尽力地通过校招向大公司挤,但凡挤进去,你这一生会容易很多。 大公司哪里好?没能进大公司怎么办?答案都在这里了,记得帮我点赞哦。 目录: 技术氛围 内部晋升与跳槽 啥也没学会,公司倒闭了? 不同的人脉圈,注定会有不同的结果 没能去大厂怎么办? 一、技术氛围 纵观整个程序员技术领域,哪个在行业有所名气的大牛,不是在大厂? 而且众所...

男生更看重女生的身材脸蛋,还是思想?

往往,我们看不进去大段大段的逻辑。深刻的哲理,往往短而精悍,一阵见血。问:产品经理挺漂亮的,有点心动,但不知道合不合得来。男生更看重女生的身材脸蛋,还是...

程序员为什么千万不要瞎努力?

本文作者用对比非常鲜明的两个开发团队的故事,讲解了敏捷开发之道 —— 如果你的团队缺乏统一标准的环境,那么即使勤劳努力,不仅会极其耗时而且成果甚微,使用...

为什么程序员做外包会被瞧不起?

二哥,有个事想询问下您的意见,您觉得应届生值得去外包吗?公司虽然挺大的,中xx,但待遇感觉挺低,马上要报到,挺纠结的。

面试阿里p7,被按在地上摩擦,鬼知道我经历了什么?

面试阿里p7被问到的问题(当时我只知道第一个):@Conditional是做什么的?@Conditional多个条件是什么逻辑关系?条件判断在什么时候执...

终于懂了TCP和UDP协议区别

终于懂了TCP和UDP协议区别

无代码时代来临,程序员如何保住饭碗?

编程语言层出不穷,从最初的机器语言到如今2500种以上的高级语言,程序员们大呼“学到头秃”。程序员一边面临编程语言不断推陈出新,一边面临由于许多代码已存在,程序员编写新应用程序时存在重复“搬砖”的现象。 无代码/低代码编程应运而生。无代码/低代码是一种创建应用的方法,它可以让开发者使用最少的编码知识来快速开发应用程序。开发者通过图形界面中,可视化建模来组装和配置应用程序。这样一来,开发者直...

面试了一个 31 岁程序员,让我有所触动,30岁以上的程序员该何去何从?

最近面试了一个31岁8年经验的程序猿,让我有点感慨,大龄程序猿该何去何从。

大三实习生,字节跳动面经分享,已拿Offer

说实话,自己的算法,我一个不会,太难了吧

程序员垃圾简历长什么样?

已经连续五年参加大厂校招、社招的技术面试工作,简历看的不下于万份 这篇文章会用实例告诉你,什么是差的程序员简历! 疫情快要结束了,各个公司也都开始春招了,作为即将红遍大江南北的新晋UP主,那当然要为小伙伴们做点事(手动狗头)。 就在公众号里公开征简历,义务帮大家看,并一一点评。《启舰:春招在即,义务帮大家看看简历吧》 一石激起千层浪,三天收到两百多封简历。 花光了两个星期的所有空闲时...

《经典算法案例》01-08:如何使用质数设计扫雷(Minesweeper)游戏

我们都玩过Windows操作系统中的经典游戏扫雷(Minesweeper),如果把质数当作一颗雷,那么,表格中红色的数字哪些是雷(质数)?您能找出多少个呢?文中用列表的方式罗列了10000以内的自然数、质数(素数),6的倍数等,方便大家观察质数的分布规律及特性,以便对算法求解有指导意义。另外,判断质数是初学算法,理解算法重要性的一个非常好的案例。

《Oracle Java SE编程自学与面试指南》最佳学习路线图(2020最新版)

01、Java入门(Getting Started);02、集成开发环境(IDE);03、项目结构(Eclipse JavaProject);04、类和对象(Classes and Objects);05:词法结构(Lexical Structure);06:数据类型和变量(Data Type and Variables);07:运算符(Operators);08:控制流程语句(Control Flow Statements);

大牛都会用的IDEA调试技巧!!!

导读 前天面试了一个985高校的实习生,问了他平时用什么开发工具,他想也没想的说IDEA,于是我抛砖引玉的问了一下IDEA的调试用过吧,你说说怎么设置断点...

立即提问
相关内容推荐