spark sql 查询hive中的数据,查询结果全部为null 5C

16/08/29 15:32:46 INFO ParseDriver: Parsing command: FROM dim_shop SELECT koubei_id,customer_id,koubei_customer_pid,first_cat_id,second_cat_id,third_cat_id,owner_id,shop_sour
ce,transferred_out where dt = '20160130'
16/08/29 15:32:46 INFO ParseDriver: Parse Completed
16/08/29 15:32:47 INFO MemoryStore: ensureFreeSpace(428968) called with curMem=1497706, maxMem=556038881
16/08/29 15:32:47 INFO MemoryStore: Block broadcast_8 stored as values in memory (estimated size 418.9 KB, free 528.4 MB)
16/08/29 15:32:47 INFO MemoryStore: ensureFreeSpace(46219) called with curMem=1926674, maxMem=556038881
16/08/29 15:32:47 INFO MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 45.1 KB, free 528.4 MB)
16/08/29 15:32:47 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory on 10.100.24.30:57113 (size: 45.1 KB, free: 530.1 MB)
16/08/29 15:32:47 INFO SparkContext: Created broadcast 8 from show at TmpAliTradSchema.scala:53
16/08/29 15:32:47 INFO FileInputFormat: Total input paths to process : 1
16/08/29 15:32:47 INFO NetworkTopology: Adding a new node: /default/10.100.24.30:50010
16/08/29 15:32:47 INFO NetworkTopology: Adding a new node: /default/10.100.24.10:50010
16/08/29 15:32:47 INFO NetworkTopology: Adding a new node: /default/10.100.24.29:50010
16/08/29 15:32:47 INFO SparkContext: Starting job: show at TmpAliTradSchema.scala:53
16/08/29 15:32:47 INFO DAGScheduler: Got job 5 (show at TmpAliTradSchema.scala:53) with 1 output partitions
16/08/29 15:32:47 INFO DAGScheduler: Final stage: ResultStage 5(show at TmpAliTradSchema.scala:53)
16/08/29 15:32:47 INFO DAGScheduler: Parents of final stage: List()
16/08/29 15:32:47 INFO DAGScheduler: Missing parents: List()
16/08/29 15:32:47 INFO DAGScheduler: Submitting ResultStage 5 (MapPartitionsRDD[25] at show at TmpAliTradSchema.scala:53), which has no missing parents
16/08/29 15:32:47 INFO MemoryStore: ensureFreeSpace(14504) called with curMem=1972893, maxMem=556038881
16/08/29 15:32:47 INFO MemoryStore: Block broadcast_9 stored as values in memory (estimated size 14.2 KB, free 528.4 MB)
16/08/29 15:32:47 INFO MemoryStore: ensureFreeSpace(5576) called with curMem=1987397, maxMem=556038881
16/08/29 15:32:47 INFO MemoryStore: Block broadcast_9_piece0 stored as bytes in memory (estimated size 5.4 KB, free 528.4 MB)
16/08/29 15:32:47 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory on 10.100.24.30:57113 (size: 5.4 KB, free: 530.1 MB)
16/08/29 15:32:47 INFO SparkContext: Created broadcast 9 from broadcast at DAGScheduler.scala:861
16/08/29 15:32:47 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 5 (MapPartitionsRDD[25] at show at TmpAliTradSchema.scala:53)
16/08/29 15:32:47 INFO YarnScheduler: Adding task set 5.0 with 1 tasks
16/08/29 15:32:47 INFO TaskSetManager: Starting task 0.0 in stage 5.0 (TID 5, datanode162.hadoop, partition 0,NODE_LOCAL, 2443 bytes)
16/08/29 15:32:47 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory on datanode162.hadoop:38271 (size: 5.4 KB, free: 530.1 MB)
16/08/29 15:32:47 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory on datanode162.hadoop:38271 (size: 45.1 KB, free: 530.1 MB)
16/08/29 15:32:47 INFO DAGScheduler: ResultStage 5 (show at TmpAliTradSchema.scala:53) finished in 0.199 s
16/08/29 15:32:47 INFO TaskSetManager: Finished task 0.0 in stage 5.0 (TID 5) in 202 ms on datanode162.hadoop (1/1)
16/08/29 15:32:47 INFO DAGScheduler: Job 5 finished: show at TmpAliTradSchema.scala:53, took 0.251634 s
16/08/29 15:32:47 INFO YarnScheduler: Removed TaskSet 5.0, whose tasks have all completed, from pool
+---------+-----------+-------------------+------------+-------------+------------+--------+-----------+---------------+
|koubei_id|customer_id|koubei_customer_pid|first_cat_id|second_cat_id|third_cat_id|owner_id|shop_source|transferred_out|
+---------+-----------+-------------------+------------+-------------+------------+--------+-----------+---------------+
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
+---------+-----------+-------------------+------------+-------------+------------+--------+-----------+---------------+
only showing top 20 rows

目前安装的spark 是CDH 5.5带的1.5.2版本,只有在hive进行分区,且指定分隔符不为默认的\001才会出现该问题

5个回答

试试下面这个设置

set spark.sql.hive.convertMetastoreParquet=false

通过目前测试来看,应该是CDH5.5版本的一个bug,查询结果为空只有在表分区,且指定列分隔符不为 \001 的时候才会出现

查询结果为空一般都是自己的查询语句有问题

我也遇到了同样的问题,想问下解决没

hive中创建表的时候指定了“ROW FORMAT DELIMITED FIELDS TERMINATED BY。。。”,不指定delimited fields就不会出现这种情况,个人也碰到了这种情况,是这样解决的

Megatron_101
Megatron_101 我用一条sql语句查询,结果前几列字段是有的,后面要查的字段,全是一列一列的null值,网上搜了很多资料也没解决,能请教下你具体是怎么解决的么?
大约 2 年之前 回复
Csdn user default icon
上传中...
上传图片
插入图片
抄袭、复制答案,以达到刷声望分或其他目的的行为,在CSDN问答是严格禁止的,一经发现立刻封号。是时候展现真正的技术了!
其他相关推荐
spark通过jdbc读取hive的表报错,我是在zeppelin里运行的

## 代码: import org.apache.spark.sql.hive.HiveContext val pro = new java.util.Properties() pro.setProperty("user", "****") pro.setProperty("password", "*****") val driverName = "org.apache.hadoop.hive.jdbc.HiveDriver"; Class.forName(driverName); val hiveContext = new HiveContext(sc) val hivetable = hiveContext.read.jdbc("jdbc:hive://*****/default", "*****", pro); ## 错误: import org.apache.spark.sql.hive.HiveContext pro: java.util.Properties = {} res15: Object = null res16: Object = null driverName: String = org.apache.hadoop.hive.jdbc.HiveDriver res17: Class[_] = class org.apache.hadoop.hive.jdbc.HiveDriver warning: there was one deprecation warning; re-run with -deprecation for details hiveContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@14f9cc13 java.sql.SQLException: Method not supported at org.apache.hadoop.hive.jdbc.HiveResultSetMetaData.isSigned(Unknown Source) at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:232) at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:64) at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113) at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:45) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125) at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:166) ... 46 elided

sparkSql使用insert、create table tablename as select 。。。会报一个错,查了很久都没有查到原因。

我在spark的bin下,使用spark-sql。 可以查阅hive库。 show databases; show tables; select * from tablename; drop table tablename; 但是一旦使用insert、create就会报错。 日志如下,有没有大牛答疑解惑呢? ``` > > > create table bak as select * from dm_bi_org_cfg; 18/10/20 15:31:53 ERROR Utils: Aborting task org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:249) at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:123) at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:103) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:305) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:314) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at java.lang.Enum.valueOf(Enum.java:238) at org.apache.hadoop.io.SequenceFile$CompressionType.valueOf(SequenceFile.java:220) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:242) ... 16 more 18/10/20 15:31:53 WARN FileOutputCommitter: Could not delete hdfs://master.datavip.ezhiyang.com:8020/user/hive/warehouse/bi_dm.db/bak/.hive-staging_hive_2018-10-20_15-31-53_518_4175343188146321007-1/-ext-10000/_temporary/0/_temporary/attempt_20181020153153_0003_m_000000_0 18/10/20 15:31:53 ERROR FileFormatWriter: Job job_20181020153153_0003 aborted. 18/10/20 15:31:53 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 3) org.apache.spark.SparkException: Task failed while writing rows at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:249) at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:123) at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:103) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:305) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:314) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261) ... 8 more Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at java.lang.Enum.valueOf(Enum.java:238) at org.apache.hadoop.io.SequenceFile$CompressionType.valueOf(SequenceFile.java:220) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:242) ... 16 more 18/10/20 15:31:53 WARN TaskSetManager: Lost task 0.0 in stage 3.0 (TID 3, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:249) at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:123) at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:103) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:305) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:314) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261) ... 8 more Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at java.lang.Enum.valueOf(Enum.java:238) at org.apache.hadoop.io.SequenceFile$CompressionType.valueOf(SequenceFile.java:220) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:242) ... 16 more 18/10/20 15:31:53 ERROR TaskSetManager: Task 0 in stage 3.0 failed 1 times; aborting job 18/10/20 15:31:53 ERROR FileFormatWriter: Aborting job null. org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 (TID 3, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:249) at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:123) at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:103) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:305) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:314) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261) ... 8 more Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at java.lang.Enum.valueOf(Enum.java:238) at org.apache.hadoop.io.SequenceFile$CompressionType.valueOf(SequenceFile.java:220) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:242) ... 16 more Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814) at scala.Option.foreach(Option.scala:257) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:188) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173) at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.run(InsertIntoHiveTable.scala:317) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56) at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116) at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92) at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92) at org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand.run(CreateHiveTableAsSelectCommand.scala:81) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67) at org.apache.spark.sql.Dataset.<init>(Dataset.scala:182) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:67) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:623) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:691) at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:62) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:340) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:248) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: org.apache.spark.SparkException: Task failed while writing rows at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:249) at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:123) at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:103) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:305) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:314) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261) ... 8 more Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at java.lang.Enum.valueOf(Enum.java:238) at org.apache.hadoop.io.SequenceFile$CompressionType.valueOf(SequenceFile.java:220) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:242) ... 16 more 18/10/20 15:31:53 ERROR SparkSQLDriver: Failed in [create table bak as select * from dm_bi_org_cfg] org.apache.spark.SparkException: Job aborted. at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:215) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173) at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.run(InsertIntoHiveTable.scala:317) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56) at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116) at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92) at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92) at org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand.run(CreateHiveTableAsSelectCommand.scala:81) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67) at org.apache.spark.sql.Dataset.<init>(Dataset.scala:182) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:67) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:623) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:691) at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:62) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:340) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:248) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 (TID 3, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:249) at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:123) at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:103) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:305) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:314) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261) ... 8 more Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at java.lang.Enum.valueOf(Enum.java:238) at org.apache.hadoop.io.SequenceFile$CompressionType.valueOf(SequenceFile.java:220) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:242) ... 16 more Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814) at scala.Option.foreach(Option.scala:257) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:188) ... 38 more Caused by: org.apache.spark.SparkException: Task failed while writing rows at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:249) at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:123) at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:103) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:305) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:314) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261) ... 8 more Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at java.lang.Enum.valueOf(Enum.java:238) at org.apache.hadoop.io.SequenceFile$CompressionType.valueOf(SequenceFile.java:220) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:242) ... 16 more org.apache.spark.SparkException: Job aborted. at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:215) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173) at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.run(InsertIntoHiveTable.scala:317) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56) at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116) at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92) at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92) at org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand.run(CreateHiveTableAsSelectCommand.scala:81) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67) at org.apache.spark.sql.Dataset.<init>(Dataset.scala:182) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:67) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:623) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:691) at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:62) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:340) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:248) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 (TID 3, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:249) at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:123) at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:103) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:305) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:314) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261) ... 8 more Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at java.lang.Enum.valueOf(Enum.java:238) at org.apache.hadoop.io.SequenceFile$CompressionType.valueOf(SequenceFile.java:220) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:242) ... 16 more Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814) at scala.Option.foreach(Option.scala:257) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630) at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:188) ... 38 more Caused by: org.apache.spark.SparkException: Task failed while writing rows at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:249) at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:123) at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:103) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:305) at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:314) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258) at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261) ... 8 more Caused by: java.lang.IllegalArgumentException: No enum constant org.apache.hadoop.io.SequenceFile.CompressionType.block at java.lang.Enum.valueOf(Enum.java:238) at org.apache.hadoop.io.SequenceFile$CompressionType.valueOf(SequenceFile.java:220) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:242) ... 16 more spark-sql> ```

spark 读取不到hive metastore 获取不到数据库

直接上异常 ``` Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/data01/hadoop/yarn/local/filecache/355/spark2-hdp-yarn-archive.tar.gz/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/hdp/2.6.5.0-292/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 19/08/13 19:53:17 INFO SignalUtils: Registered signal handler for TERM 19/08/13 19:53:17 INFO SignalUtils: Registered signal handler for HUP 19/08/13 19:53:17 INFO SignalUtils: Registered signal handler for INT 19/08/13 19:53:17 INFO SecurityManager: Changing view acls to: yarn,hdfs 19/08/13 19:53:17 INFO SecurityManager: Changing modify acls to: yarn,hdfs 19/08/13 19:53:17 INFO SecurityManager: Changing view acls groups to: 19/08/13 19:53:17 INFO SecurityManager: Changing modify acls groups to: 19/08/13 19:53:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set() 19/08/13 19:53:18 INFO ApplicationMaster: Preparing Local resources 19/08/13 19:53:19 INFO ApplicationMaster: ApplicationAttemptId: appattempt_1565610088533_0087_000001 19/08/13 19:53:19 INFO ApplicationMaster: Starting the user application in a separate Thread 19/08/13 19:53:19 INFO ApplicationMaster: Waiting for spark context initialization... 19/08/13 19:53:19 INFO SparkContext: Running Spark version 2.3.0.2.6.5.0-292 19/08/13 19:53:19 INFO SparkContext: Submitted application: voice_stream 19/08/13 19:53:19 INFO SecurityManager: Changing view acls to: yarn,hdfs 19/08/13 19:53:19 INFO SecurityManager: Changing modify acls to: yarn,hdfs 19/08/13 19:53:19 INFO SecurityManager: Changing view acls groups to: 19/08/13 19:53:19 INFO SecurityManager: Changing modify acls groups to: 19/08/13 19:53:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set() 19/08/13 19:53:19 INFO Utils: Successfully started service 'sparkDriver' on port 20410. 19/08/13 19:53:19 INFO SparkEnv: Registering MapOutputTracker 19/08/13 19:53:19 INFO SparkEnv: Registering BlockManagerMaster 19/08/13 19:53:19 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 19/08/13 19:53:19 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 19/08/13 19:53:19 INFO DiskBlockManager: Created local directory at /data01/hadoop/yarn/local/usercache/hdfs/appcache/application_1565610088533_0087/blockmgr-94d35b97-43b2-496e-a4cb-73ecd3ed186c 19/08/13 19:53:19 INFO MemoryStore: MemoryStore started with capacity 366.3 MB 19/08/13 19:53:19 INFO SparkEnv: Registering OutputCommitCoordinator 19/08/13 19:53:19 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter 19/08/13 19:53:19 INFO Utils: Successfully started service 'SparkUI' on port 28852. 19/08/13 19:53:19 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://datanode02:28852 19/08/13 19:53:19 INFO YarnClusterScheduler: Created YarnClusterScheduler 19/08/13 19:53:20 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1565610088533_0087 and attemptId Some(appattempt_1565610088533_0087_000001) 19/08/13 19:53:20 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 31984. 19/08/13 19:53:20 INFO NettyBlockTransferService: Server created on datanode02:31984 19/08/13 19:53:20 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 19/08/13 19:53:20 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, datanode02, 31984, None) 19/08/13 19:53:20 INFO BlockManagerMasterEndpoint: Registering block manager datanode02:31984 with 366.3 MB RAM, BlockManagerId(driver, datanode02, 31984, None) 19/08/13 19:53:20 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, datanode02, 31984, None) 19/08/13 19:53:20 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, datanode02, 31984, None) 19/08/13 19:53:20 INFO EventLoggingListener: Logging events to hdfs:/spark2-history/application_1565610088533_0087_1 19/08/13 19:53:20 INFO ApplicationMaster: =============================================================================== YARN executor launch context: env: CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>/usr/hdp/2.6.5.0-292/hadoop/conf<CPS>/usr/hdp/2.6.5.0-292/hadoop/*<CPS>/usr/hdp/2.6.5.0-292/hadoop/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>/usr/hdp/current/ext/hadoop/*<CPS>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/2.6.5.0-292/hadoop/lib/hadoop-lzo-0.6.0.2.6.5.0-292.jar:/etc/hadoop/conf/secure:/usr/hdp/current/ext/hadoop/*<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__ SPARK_YARN_STAGING_DIR -> *********(redacted) SPARK_USER -> *********(redacted) command: LD_LIBRARY_PATH="/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:$LD_LIBRARY_PATH" \ {{JAVA_HOME}}/bin/java \ -server \ -Xmx5120m \ -Djava.io.tmpdir={{PWD}}/tmp \ '-Dspark.history.ui.port=18081' \ '-Dspark.rpc.message.maxSize=100' \ -Dspark.yarn.app.container.log.dir=<LOG_DIR> \ -XX:OnOutOfMemoryError='kill %p' \ org.apache.spark.executor.CoarseGrainedExecutorBackend \ --driver-url \ spark://CoarseGrainedScheduler@datanode02:20410 \ --executor-id \ <executorId> \ --hostname \ <hostname> \ --cores \ 2 \ --app-id \ application_1565610088533_0087 \ --user-class-path \ file:$PWD/__app__.jar \ --user-class-path \ file:$PWD/hadoop-common-2.7.3.jar \ --user-class-path \ file:$PWD/guava-12.0.1.jar \ --user-class-path \ file:$PWD/hbase-server-1.2.8.jar \ --user-class-path \ file:$PWD/hbase-protocol-1.2.8.jar \ --user-class-path \ file:$PWD/hbase-client-1.2.8.jar \ --user-class-path \ file:$PWD/hbase-common-1.2.8.jar \ --user-class-path \ file:$PWD/mysql-connector-java-5.1.44-bin.jar \ --user-class-path \ file:$PWD/spark-streaming-kafka-0-8-assembly_2.11-2.3.2.jar \ --user-class-path \ file:$PWD/spark-examples_2.11-1.6.0-typesafe-001.jar \ --user-class-path \ file:$PWD/fastjson-1.2.7.jar \ 1><LOG_DIR>/stdout \ 2><LOG_DIR>/stderr resources: spark-streaming-kafka-0-8-assembly_2.11-2.3.2.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/spark-streaming-kafka-0-8-assembly_2.11-2.3.2.jar" } size: 12271027 timestamp: 1565697198603 type: FILE visibility: PRIVATE spark-examples_2.11-1.6.0-typesafe-001.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/spark-examples_2.11-1.6.0-typesafe-001.jar" } size: 1867746 timestamp: 1565697198751 type: FILE visibility: PRIVATE hbase-server-1.2.8.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/hbase-server-1.2.8.jar" } size: 4197896 timestamp: 1565697197770 type: FILE visibility: PRIVATE hbase-common-1.2.8.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/hbase-common-1.2.8.jar" } size: 570163 timestamp: 1565697198318 type: FILE visibility: PRIVATE __app__.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/spark_history_data2.jar" } size: 44924 timestamp: 1565697197260 type: FILE visibility: PRIVATE guava-12.0.1.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/guava-12.0.1.jar" } size: 1795932 timestamp: 1565697197614 type: FILE visibility: PRIVATE hbase-client-1.2.8.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/hbase-client-1.2.8.jar" } size: 1306401 timestamp: 1565697198180 type: FILE visibility: PRIVATE __spark_conf__ -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/__spark_conf__.zip" } size: 273513 timestamp: 1565697199131 type: ARCHIVE visibility: PRIVATE fastjson-1.2.7.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/fastjson-1.2.7.jar" } size: 417221 timestamp: 1565697198865 type: FILE visibility: PRIVATE hbase-protocol-1.2.8.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/hbase-protocol-1.2.8.jar" } size: 4366252 timestamp: 1565697198023 type: FILE visibility: PRIVATE __spark_libs__ -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/hdp/apps/2.6.5.0-292/spark2/spark2-hdp-yarn-archive.tar.gz" } size: 227600110 timestamp: 1549953820247 type: ARCHIVE visibility: PUBLIC mysql-connector-java-5.1.44-bin.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/mysql-connector-java-5.1.44-bin.jar" } size: 999635 timestamp: 1565697198445 type: FILE visibility: PRIVATE hadoop-common-2.7.3.jar -> resource { scheme: "hdfs" host: "CID-042fb939-95b4-4b74-91b8-9f94b999bdf7" port: -1 file: "/user/hdfs/.sparkStaging/application_1565610088533_0087/hadoop-common-2.7.3.jar" } size: 3479293 timestamp: 1565697197476 type: FILE visibility: PRIVATE =============================================================================== 19/08/13 19:53:20 INFO RMProxy: Connecting to ResourceManager at namenode02/10.1.38.38:8030 19/08/13 19:53:20 INFO YarnRMClient: Registering the ApplicationMaster 19/08/13 19:53:20 INFO YarnAllocator: Will request 3 executor container(s), each with 2 core(s) and 5632 MB memory (including 512 MB of overhead) 19/08/13 19:53:20 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@datanode02:20410) 19/08/13 19:53:20 INFO YarnAllocator: Submitted 3 unlocalized container requests. 19/08/13 19:53:20 INFO ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals 19/08/13 19:53:20 INFO AMRMClientImpl: Received new token for : datanode03:45454 19/08/13 19:53:21 INFO YarnAllocator: Launching container container_e20_1565610088533_0087_01_000002 on host datanode03 for executor with ID 1 19/08/13 19:53:21 INFO YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them. 19/08/13 19:53:21 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0 19/08/13 19:53:21 INFO ContainerManagementProtocolProxy: Opening proxy : datanode03:45454 19/08/13 19:53:21 INFO AMRMClientImpl: Received new token for : datanode01:45454 19/08/13 19:53:21 INFO YarnAllocator: Launching container container_e20_1565610088533_0087_01_000003 on host datanode01 for executor with ID 2 19/08/13 19:53:21 INFO YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them. 19/08/13 19:53:21 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0 19/08/13 19:53:21 INFO ContainerManagementProtocolProxy: Opening proxy : datanode01:45454 19/08/13 19:53:22 INFO AMRMClientImpl: Received new token for : datanode02:45454 19/08/13 19:53:22 INFO YarnAllocator: Launching container container_e20_1565610088533_0087_01_000004 on host datanode02 for executor with ID 3 19/08/13 19:53:22 INFO YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them. 19/08/13 19:53:22 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0 19/08/13 19:53:22 INFO ContainerManagementProtocolProxy: Opening proxy : datanode02:45454 19/08/13 19:53:24 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.1.198.144:41122) with ID 1 19/08/13 19:53:25 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.1.229.163:24656) with ID 3 19/08/13 19:53:25 INFO BlockManagerMasterEndpoint: Registering block manager datanode03:3328 with 2.5 GB RAM, BlockManagerId(1, datanode03, 3328, None) 19/08/13 19:53:25 INFO BlockManagerMasterEndpoint: Registering block manager datanode02:28863 with 2.5 GB RAM, BlockManagerId(3, datanode02, 28863, None) 19/08/13 19:53:25 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.1.229.158:64276) with ID 2 19/08/13 19:53:25 INFO YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8 19/08/13 19:53:25 INFO YarnClusterScheduler: YarnClusterScheduler.postStartHook done 19/08/13 19:53:25 INFO BlockManagerMasterEndpoint: Registering block manager datanode01:20487 with 2.5 GB RAM, BlockManagerId(2, datanode01, 20487, None) 19/08/13 19:53:25 WARN SparkContext: Using an existing SparkContext; some configuration may not take effect. 19/08/13 19:53:25 INFO SparkContext: Starting job: start at VoiceApplication2.java:128 19/08/13 19:53:25 INFO DAGScheduler: Registering RDD 1 (start at VoiceApplication2.java:128) 19/08/13 19:53:25 INFO DAGScheduler: Got job 0 (start at VoiceApplication2.java:128) with 20 output partitions 19/08/13 19:53:25 INFO DAGScheduler: Final stage: ResultStage 1 (start at VoiceApplication2.java:128) 19/08/13 19:53:25 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0) 19/08/13 19:53:25 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0) 19/08/13 19:53:26 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[1] at start at VoiceApplication2.java:128), which has no missing parents 19/08/13 19:53:26 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.1 KB, free 366.3 MB) 19/08/13 19:53:26 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2011.0 B, free 366.3 MB) 19/08/13 19:53:26 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on datanode02:31984 (size: 2011.0 B, free: 366.3 MB) 19/08/13 19:53:26 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1039 19/08/13 19:53:26 INFO DAGScheduler: Submitting 50 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[1] at start at VoiceApplication2.java:128) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 19/08/13 19:53:26 INFO YarnClusterScheduler: Adding task set 0.0 with 50 tasks 19/08/13 19:53:26 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, datanode02, executor 3, partition 0, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, datanode03, executor 1, partition 1, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, datanode01, executor 2, partition 2, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, datanode02, executor 3, partition 3, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, datanode03, executor 1, partition 4, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, datanode01, executor 2, partition 5, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on datanode02:28863 (size: 2011.0 B, free: 2.5 GB) 19/08/13 19:53:26 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on datanode03:3328 (size: 2011.0 B, free: 2.5 GB) 19/08/13 19:53:26 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on datanode01:20487 (size: 2011.0 B, free: 2.5 GB) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, datanode02, executor 3, partition 6, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, datanode02, executor 3, partition 7, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 693 ms on datanode02 (executor 3) (1/50) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 712 ms on datanode02 (executor 3) (2/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, datanode02, executor 3, partition 8, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 7) in 21 ms on datanode02 (executor 3) (3/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, datanode02, executor 3, partition 9, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in 26 ms on datanode02 (executor 3) (4/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 10, datanode02, executor 3, partition 10, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in 23 ms on datanode02 (executor 3) (5/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 11, datanode02, executor 3, partition 11, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 9) in 25 ms on datanode02 (executor 3) (6/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 12, datanode02, executor 3, partition 12, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 10.0 in stage 0.0 (TID 10) in 18 ms on datanode02 (executor 3) (7/50) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 11.0 in stage 0.0 (TID 11) in 14 ms on datanode02 (executor 3) (8/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 13.0 in stage 0.0 (TID 13, datanode02, executor 3, partition 13, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 14.0 in stage 0.0 (TID 14, datanode02, executor 3, partition 14, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 12.0 in stage 0.0 (TID 12) in 16 ms on datanode02 (executor 3) (9/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 15.0 in stage 0.0 (TID 15, datanode02, executor 3, partition 15, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 13.0 in stage 0.0 (TID 13) in 22 ms on datanode02 (executor 3) (10/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 16.0 in stage 0.0 (TID 16, datanode02, executor 3, partition 16, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 14.0 in stage 0.0 (TID 14) in 16 ms on datanode02 (executor 3) (11/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 17.0 in stage 0.0 (TID 17, datanode02, executor 3, partition 17, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 15.0 in stage 0.0 (TID 15) in 13 ms on datanode02 (executor 3) (12/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 18.0 in stage 0.0 (TID 18, datanode01, executor 2, partition 18, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 19.0 in stage 0.0 (TID 19, datanode01, executor 2, partition 19, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 787 ms on datanode01 (executor 2) (13/50) 19/08/13 19:53:26 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 789 ms on datanode01 (executor 2) (14/50) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 20.0 in stage 0.0 (TID 20, datanode03, executor 1, partition 20, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:26 INFO TaskSetManager: Starting task 21.0 in stage 0.0 (TID 21, datanode03, executor 1, partition 21, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 905 ms on datanode03 (executor 1) (15/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 907 ms on datanode03 (executor 1) (16/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 22.0 in stage 0.0 (TID 22, datanode02, executor 3, partition 22, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 23.0 in stage 0.0 (TID 23, datanode02, executor 3, partition 23, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 24.0 in stage 0.0 (TID 24, datanode01, executor 2, partition 24, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 18.0 in stage 0.0 (TID 18) in 124 ms on datanode01 (executor 2) (17/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 16.0 in stage 0.0 (TID 16) in 134 ms on datanode02 (executor 3) (18/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 25.0 in stage 0.0 (TID 25, datanode01, executor 2, partition 25, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 26.0 in stage 0.0 (TID 26, datanode03, executor 1, partition 26, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 17.0 in stage 0.0 (TID 17) in 134 ms on datanode02 (executor 3) (19/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 20.0 in stage 0.0 (TID 20) in 122 ms on datanode03 (executor 1) (20/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 27.0 in stage 0.0 (TID 27, datanode03, executor 1, partition 27, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 19.0 in stage 0.0 (TID 19) in 127 ms on datanode01 (executor 2) (21/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 21.0 in stage 0.0 (TID 21) in 123 ms on datanode03 (executor 1) (22/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 28.0 in stage 0.0 (TID 28, datanode02, executor 3, partition 28, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 29.0 in stage 0.0 (TID 29, datanode02, executor 3, partition 29, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 22.0 in stage 0.0 (TID 22) in 19 ms on datanode02 (executor 3) (23/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 23.0 in stage 0.0 (TID 23) in 18 ms on datanode02 (executor 3) (24/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 30.0 in stage 0.0 (TID 30, datanode01, executor 2, partition 30, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 31.0 in stage 0.0 (TID 31, datanode01, executor 2, partition 31, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 25.0 in stage 0.0 (TID 25) in 27 ms on datanode01 (executor 2) (25/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 24.0 in stage 0.0 (TID 24) in 29 ms on datanode01 (executor 2) (26/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 32.0 in stage 0.0 (TID 32, datanode02, executor 3, partition 32, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 29.0 in stage 0.0 (TID 29) in 16 ms on datanode02 (executor 3) (27/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 33.0 in stage 0.0 (TID 33, datanode03, executor 1, partition 33, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 26.0 in stage 0.0 (TID 26) in 30 ms on datanode03 (executor 1) (28/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 34.0 in stage 0.0 (TID 34, datanode02, executor 3, partition 34, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 28.0 in stage 0.0 (TID 28) in 21 ms on datanode02 (executor 3) (29/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 35.0 in stage 0.0 (TID 35, datanode03, executor 1, partition 35, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 27.0 in stage 0.0 (TID 27) in 32 ms on datanode03 (executor 1) (30/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 36.0 in stage 0.0 (TID 36, datanode02, executor 3, partition 36, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 32.0 in stage 0.0 (TID 32) in 11 ms on datanode02 (executor 3) (31/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 37.0 in stage 0.0 (TID 37, datanode01, executor 2, partition 37, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 30.0 in stage 0.0 (TID 30) in 18 ms on datanode01 (executor 2) (32/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 38.0 in stage 0.0 (TID 38, datanode01, executor 2, partition 38, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 31.0 in stage 0.0 (TID 31) in 20 ms on datanode01 (executor 2) (33/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 39.0 in stage 0.0 (TID 39, datanode03, executor 1, partition 39, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 33.0 in stage 0.0 (TID 33) in 17 ms on datanode03 (executor 1) (34/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 34.0 in stage 0.0 (TID 34) in 17 ms on datanode02 (executor 3) (35/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 40.0 in stage 0.0 (TID 40, datanode02, executor 3, partition 40, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 41.0 in stage 0.0 (TID 41, datanode03, executor 1, partition 41, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 35.0 in stage 0.0 (TID 35) in 17 ms on datanode03 (executor 1) (36/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 42.0 in stage 0.0 (TID 42, datanode02, executor 3, partition 42, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 36.0 in stage 0.0 (TID 36) in 16 ms on datanode02 (executor 3) (37/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 43.0 in stage 0.0 (TID 43, datanode01, executor 2, partition 43, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 37.0 in stage 0.0 (TID 37) in 16 ms on datanode01 (executor 2) (38/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 44.0 in stage 0.0 (TID 44, datanode02, executor 3, partition 44, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 45.0 in stage 0.0 (TID 45, datanode02, executor 3, partition 45, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 40.0 in stage 0.0 (TID 40) in 14 ms on datanode02 (executor 3) (39/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 42.0 in stage 0.0 (TID 42) in 11 ms on datanode02 (executor 3) (40/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 46.0 in stage 0.0 (TID 46, datanode03, executor 1, partition 46, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 39.0 in stage 0.0 (TID 39) in 20 ms on datanode03 (executor 1) (41/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 47.0 in stage 0.0 (TID 47, datanode03, executor 1, partition 47, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 41.0 in stage 0.0 (TID 41) in 20 ms on datanode03 (executor 1) (42/50) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 48.0 in stage 0.0 (TID 48, datanode01, executor 2, partition 48, PROCESS_LOCAL, 7831 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 49.0 in stage 0.0 (TID 49, datanode01, executor 2, partition 49, PROCESS_LOCAL, 7888 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 43.0 in stage 0.0 (TID 43) in 18 ms on datanode01 (executor 2) (43/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 38.0 in stage 0.0 (TID 38) in 31 ms on datanode01 (executor 2) (44/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 45.0 in stage 0.0 (TID 45) in 11 ms on datanode02 (executor 3) (45/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 44.0 in stage 0.0 (TID 44) in 16 ms on datanode02 (executor 3) (46/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 46.0 in stage 0.0 (TID 46) in 18 ms on datanode03 (executor 1) (47/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 48.0 in stage 0.0 (TID 48) in 15 ms on datanode01 (executor 2) (48/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 47.0 in stage 0.0 (TID 47) in 15 ms on datanode03 (executor 1) (49/50) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 49.0 in stage 0.0 (TID 49) in 25 ms on datanode01 (executor 2) (50/50) 19/08/13 19:53:27 INFO YarnClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 19/08/13 19:53:27 INFO DAGScheduler: ShuffleMapStage 0 (start at VoiceApplication2.java:128) finished in 1.174 s 19/08/13 19:53:27 INFO DAGScheduler: looking for newly runnable stages 19/08/13 19:53:27 INFO DAGScheduler: running: Set() 19/08/13 19:53:27 INFO DAGScheduler: waiting: Set(ResultStage 1) 19/08/13 19:53:27 INFO DAGScheduler: failed: Set() 19/08/13 19:53:27 INFO DAGScheduler: Submitting ResultStage 1 (ShuffledRDD[2] at start at VoiceApplication2.java:128), which has no missing parents 19/08/13 19:53:27 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.2 KB, free 366.3 MB) 19/08/13 19:53:27 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1979.0 B, free 366.3 MB) 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on datanode02:31984 (size: 1979.0 B, free: 366.3 MB) 19/08/13 19:53:27 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1039 19/08/13 19:53:27 INFO DAGScheduler: Submitting 20 missing tasks from ResultStage 1 (ShuffledRDD[2] at start at VoiceApplication2.java:128) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)) 19/08/13 19:53:27 INFO YarnClusterScheduler: Adding task set 1.0 with 20 tasks 19/08/13 19:53:27 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 50, datanode03, executor 1, partition 0, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 51, datanode02, executor 3, partition 1, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 52, datanode01, executor 2, partition 3, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 53, datanode03, executor 1, partition 2, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 4.0 in stage 1.0 (TID 54, datanode02, executor 3, partition 4, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 5.0 in stage 1.0 (TID 55, datanode01, executor 2, partition 5, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on datanode02:28863 (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on datanode01:20487 (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on datanode03:3328 (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:53:27 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.1.229.163:24656 19/08/13 19:53:27 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.1.198.144:41122 19/08/13 19:53:27 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.1.229.158:64276 19/08/13 19:53:27 INFO TaskSetManager: Starting task 7.0 in stage 1.0 (TID 56, datanode03, executor 1, partition 7, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 53) in 192 ms on datanode03 (executor 1) (1/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 8.0 in stage 1.0 (TID 57, datanode03, executor 1, partition 8, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 7.0 in stage 1.0 (TID 56) in 25 ms on datanode03 (executor 1) (2/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 6.0 in stage 1.0 (TID 58, datanode02, executor 3, partition 6, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 51) in 220 ms on datanode02 (executor 3) (3/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 14.0 in stage 1.0 (TID 59, datanode03, executor 1, partition 14, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 8.0 in stage 1.0 (TID 57) in 17 ms on datanode03 (executor 1) (4/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 16.0 in stage 1.0 (TID 60, datanode03, executor 1, partition 16, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 14.0 in stage 1.0 (TID 59) in 15 ms on datanode03 (executor 1) (5/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 16.0 in stage 1.0 (TID 60) in 21 ms on datanode03 (executor 1) (6/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 9.0 in stage 1.0 (TID 61, datanode02, executor 3, partition 9, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 4.0 in stage 1.0 (TID 54) in 269 ms on datanode02 (executor 3) (7/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 50) in 339 ms on datanode03 (executor 1) (8/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 10.0 in stage 1.0 (TID 62, datanode02, executor 3, partition 10, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 6.0 in stage 1.0 (TID 58) in 56 ms on datanode02 (executor 3) (9/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 11.0 in stage 1.0 (TID 63, datanode01, executor 2, partition 11, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 5.0 in stage 1.0 (TID 55) in 284 ms on datanode01 (executor 2) (10/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 12.0 in stage 1.0 (TID 64, datanode01, executor 2, partition 12, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 52) in 287 ms on datanode01 (executor 2) (11/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 13.0 in stage 1.0 (TID 65, datanode02, executor 3, partition 13, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 15.0 in stage 1.0 (TID 66, datanode02, executor 3, partition 15, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 10.0 in stage 1.0 (TID 62) in 25 ms on datanode02 (executor 3) (12/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 9.0 in stage 1.0 (TID 61) in 29 ms on datanode02 (executor 3) (13/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 17.0 in stage 1.0 (TID 67, datanode02, executor 3, partition 17, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 15.0 in stage 1.0 (TID 66) in 13 ms on datanode02 (executor 3) (14/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 13.0 in stage 1.0 (TID 65) in 16 ms on datanode02 (executor 3) (15/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 18.0 in stage 1.0 (TID 68, datanode02, executor 3, partition 18, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 19.0 in stage 1.0 (TID 69, datanode01, executor 2, partition 19, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 11.0 in stage 1.0 (TID 63) in 30 ms on datanode01 (executor 2) (16/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 12.0 in stage 1.0 (TID 64) in 30 ms on datanode01 (executor 2) (17/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 17.0 in stage 1.0 (TID 67) in 17 ms on datanode02 (executor 3) (18/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 19.0 in stage 1.0 (TID 69) in 13 ms on datanode01 (executor 2) (19/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 18.0 in stage 1.0 (TID 68) in 20 ms on datanode02 (executor 3) (20/20) 19/08/13 19:53:27 INFO YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool 19/08/13 19:53:27 INFO DAGScheduler: ResultStage 1 (start at VoiceApplication2.java:128) finished in 0.406 s 19/08/13 19:53:27 INFO DAGScheduler: Job 0 finished: start at VoiceApplication2.java:128, took 1.850883 s 19/08/13 19:53:27 INFO ReceiverTracker: Starting 1 receivers 19/08/13 19:53:27 INFO ReceiverTracker: ReceiverTracker started 19/08/13 19:53:27 INFO KafkaInputDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO KafkaInputDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO KafkaInputDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Initialized and validated org.apache.spark.streaming.kafka.KafkaInputDStream@5fd3dc81 19/08/13 19:53:27 INFO ForEachDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO ForEachDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO ForEachDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@4044ec97 19/08/13 19:53:27 INFO KafkaInputDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO KafkaInputDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO KafkaInputDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Initialized and validated org.apache.spark.streaming.kafka.KafkaInputDStream@5fd3dc81 19/08/13 19:53:27 INFO MappedDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO MappedDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO MappedDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO MappedDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO MappedDStream: Initialized and validated org.apache.spark.streaming.dstream.MappedDStream@5dd4b960 19/08/13 19:53:27 INFO ForEachDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO ForEachDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO ForEachDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@132d0c3c 19/08/13 19:53:27 INFO KafkaInputDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO KafkaInputDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO KafkaInputDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Initialized and validated org.apache.spark.streaming.kafka.KafkaInputDStream@5fd3dc81 19/08/13 19:53:27 INFO MappedDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO MappedDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO MappedDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO MappedDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO MappedDStream: Initialized and validated org.apache.spark.streaming.dstream.MappedDStream@5dd4b960 19/08/13 19:53:27 INFO ForEachDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO ForEachDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO ForEachDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@525bed0c 19/08/13 19:53:27 INFO DAGScheduler: Got job 1 (start at VoiceApplication2.java:128) with 1 output partitions 19/08/13 19:53:27 INFO DAGScheduler: Final stage: ResultStage 2 (start at VoiceApplication2.java:128) 19/08/13 19:53:27 INFO DAGScheduler: Parents of final stage: List() 19/08/13 19:53:27 INFO DAGScheduler: Missing parents: List() 19/08/13 19:53:27 INFO DAGScheduler: Submitting ResultStage 2 (Receiver 0 ParallelCollectionRDD[3] at makeRDD at ReceiverTracker.scala:613), which has no missing parents 19/08/13 19:53:27 INFO ReceiverTracker: Receiver 0 started 19/08/13 19:53:27 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 133.5 KB, free 366.2 MB) 19/08/13 19:53:27 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 36.3 KB, free 366.1 MB) 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on datanode02:31984 (size: 36.3 KB, free: 366.3 MB) 19/08/13 19:53:27 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1039 19/08/13 19:53:27 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (Receiver 0 ParallelCollectionRDD[3] at makeRDD at ReceiverTracker.scala:613) (first 15 tasks are for partitions Vector(0)) 19/08/13 19:53:27 INFO YarnClusterScheduler: Adding task set 2.0 with 1 tasks 19/08/13 19:53:27 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 70, datanode01, executor 2, partition 0, PROCESS_LOCAL, 8757 bytes) 19/08/13 19:53:27 INFO RecurringTimer: Started timer for JobGenerator at time 1565697240000 19/08/13 19:53:27 INFO JobGenerator: Started JobGenerator at 1565697240000 ms 19/08/13 19:53:27 INFO JobScheduler: Started JobScheduler 19/08/13 19:53:27 INFO StreamingContext: StreamingContext started 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on datanode01:20487 (size: 36.3 KB, free: 2.5 GB) 19/08/13 19:53:27 INFO ReceiverTracker: Registered receiver for stream 0 from 10.1.229.158:64276 19/08/13 19:54:00 INFO JobScheduler: Added jobs for time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Starting job streaming job 1565697240000 ms.0 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Starting job streaming job 1565697240000 ms.1 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Finished job streaming job 1565697240000 ms.1 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Finished job streaming job 1565697240000 ms.0 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Starting job streaming job 1565697240000 ms.2 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO SharedState: loading hive config file: file:/data01/hadoop/yarn/local/usercache/hdfs/filecache/85431/__spark_conf__.zip/__hadoop_conf__/hive-site.xml 19/08/13 19:54:00 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('hdfs://CID-042fb939-95b4-4b74-91b8-9f94b999bdf7/apps/hive/warehouse'). 19/08/13 19:54:00 INFO SharedState: Warehouse path is 'hdfs://CID-042fb939-95b4-4b74-91b8-9f94b999bdf7/apps/hive/warehouse'. 19/08/13 19:54:00 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint 19/08/13 19:54:00 INFO BlockManagerInfo: Removed broadcast_1_piece0 on datanode02:31984 in memory (size: 1979.0 B, free: 366.3 MB) 19/08/13 19:54:00 INFO BlockManagerInfo: Removed broadcast_1_piece0 on datanode02:28863 in memory (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:54:00 INFO BlockManagerInfo: Removed broadcast_1_piece0 on datanode01:20487 in memory (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:54:00 INFO BlockManagerInfo: Removed broadcast_1_piece0 on datanode03:3328 in memory (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:54:02 INFO CodeGenerator: Code generated in 175.416957 ms 19/08/13 19:54:02 INFO JobScheduler: Finished job streaming job 1565697240000 ms.2 from job set of time 1565697240000 ms 19/08/13 19:54:02 ERROR JobScheduler: Error running job streaming job 1565697240000 ms.2 org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'meta_voice' not found; at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:40) at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:331) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:388) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:398) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:393) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:122) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:115) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at scala.util.Try$.apply(Try.scala:192) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 19/08/13 19:54:02 ERROR ApplicationMaster: User class threw exception: org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'meta_voice' not found; org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'meta_voice' not found; at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:40) at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:331) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:388) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:398) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:393) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:122) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:115) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at scala.util.Try$.apply(Try.scala:192) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 19/08/13 19:54:02 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'meta_voice' not found; at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:40) at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:331) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:388) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:398) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:393) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:122) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:115) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at scala.util.Try$.apply(Try.scala:192) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) ) 19/08/13 19:54:02 INFO StreamingContext: Invoking stop(stopGracefully=true) from shutdown hook 19/08/13 19:54:02 INFO ReceiverTracker: Sent stop signal to all 1 receivers 19/08/13 19:54:02 ERROR ReceiverTracker: Deregistered receiver for stream 0: Stopped by driver 19/08/13 19:54:02 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 70) in 35055 ms on datanode01 (executor 2) (1/1) 19/08/13 19:54:02 INFO YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool 19/08/13 19:54:02 INFO DAGScheduler: ResultStage 2 (start at VoiceApplication2.java:128) finished in 35.086 s 19/08/13 19:54:02 INFO ReceiverTracker: Waiting for receiver job to terminate gracefully 19/08/13 19:54:02 INFO ReceiverTracker: Waited for receiver job to terminate gracefully 19/08/13 19:54:02 INFO ReceiverTracker: All of the receivers have deregistered successfully 19/08/13 19:54:02 INFO ReceiverTracker: ReceiverTracker stopped 19/08/13 19:54:02 INFO JobGenerator: Stopping JobGenerator gracefully 19/08/13 19:54:02 INFO JobGenerator: Waiting for all received blocks to be consumed for job generation 19/08/13 19:54:02 INFO JobGenerator: Waited for all received blocks to be consumed for job generation 19/08/13 19:54:12 WARN ShutdownHookManager: ShutdownHook '$anon$2' timeout, java.util.concurrent.TimeoutException java.util.concurrent.TimeoutException at java.util.concurrent.FutureTask.get(FutureTask.java:205) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67) 19/08/13 19:54:12 ERROR Utils: Uncaught exception in thread pool-1-thread-1 java.lang.InterruptedException at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1252) at java.lang.Thread.join(Thread.java:1326) at org.apache.spark.streaming.util.RecurringTimer.stop(RecurringTimer.scala:86) at org.apache.spark.streaming.scheduler.JobGenerator.stop(JobGenerator.scala:137) at org.apache.spark.streaming.scheduler.JobScheduler.stop(JobScheduler.scala:123) at org.apache.spark.streaming.StreamingContext$$anonfun$stop$1.apply$mcV$sp(StreamingContext.scala:681) at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1357) at org.apache.spark.streaming.StreamingContext.stop(StreamingContext.scala:680) at org.apache.spark.streaming.StreamingContext.org$apache$spark$streaming$StreamingContext$$stopOnShutdown(StreamingContext.scala:714) at org.apache.spark.streaming.StreamingContext$$anonfun$start$1.apply$mcV$sp(StreamingContext.scala:599) at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188) at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1988) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188) at scala.util.Try$.apply(Try.scala:192) at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) ```

Spark1.3基于scala2.11编译hive-thrift报错,关于jline的

[INFO] [INFO] ------------------------------------------------------------------------ [INFO] Building Spark Project Hive Thrift Server 1.3.0 [INFO] ------------------------------------------------------------------------ [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ spark-hive-thriftserver_2.11 --- [INFO] Deleting /usr/local/spark-1.3.0/sql/hive-thriftserver/target [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-versions) @ spark-hive-thriftserver_2.11 --- [INFO] [INFO] --- scala-maven-plugin:3.2.0:add-source (eclipse-add-source) @ spark-hive-thriftserver_2.11 --- [INFO] Add Source directory: /usr/local/spark-1.3.0/sql/hive-thriftserver/src/main/scala [INFO] Add Test Source directory: /usr/local/spark-1.3.0/sql/hive-thriftserver/src/test/scala [INFO] [INFO] --- build-helper-maven-plugin:1.8:add-source (add-scala-sources) @ spark-hive-thriftserver_2.11 --- [INFO] Source directory: /usr/local/spark-1.3.0/sql/hive-thriftserver/src/main/scala added. [INFO] [INFO] --- build-helper-maven-plugin:1.8:add-source (add-default-sources) @ spark-hive-thriftserver_2.11 --- [INFO] Source directory: /usr/local/spark-1.3.0/sql/hive-thriftserver/v0.13.1/src/main/scala added. [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ spark-hive-thriftserver_2.11 --- [INFO] [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ spark-hive-thriftserver_2.11 --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] skip non existing resourceDirectory /usr/local/spark-1.3.0/sql/hive-thriftserver/src/main/resources [INFO] Copying 3 resources [INFO] [INFO] --- scala-maven-plugin:3.2.0:compile (scala-compile-first) @ spark-hive-thriftserver_2.11 --- [WARNING] Zinc server is not available at port 3030 - reverting to normal incremental compile [INFO] Using incremental compilation [INFO] compiler plugin: BasicArtifact(org.scalamacros,paradise_2.11.2,2.0.1,null) [INFO] Compiling 9 Scala sources to /usr/local/spark-1.3.0/sql/hive-thriftserver/target/scala-2.11/classes... [ERROR] /usr/local/spark-1.3.0/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala:25: object ConsoleReader is not a member of package jline [ERROR] import jline.{ConsoleReader, History} [ERROR] ^ [WARNING] Class jline.Completor not found - continuing with a stub. [WARNING] Class jline.ConsoleReader not found - continuing with a stub. [ERROR] /usr/local/spark-1.3.0/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala:165: not found: type ConsoleReader [ERROR] val reader = new ConsoleReader() [ERROR] ^ [ERROR] Class jline.Completor not found - continuing with a stub. [WARNING] Class com.google.protobuf.Parser not found - continuing with a stub. [WARNING] Class com.google.protobuf.Parser not found - continuing with a stub. [WARNING] Class com.google.protobuf.Parser not found - continuing with a stub. [WARNING] Class com.google.protobuf.Parser not found - continuing with a stub. [WARNING] 6 warnings found [ERROR] three errors found [INFO] ------------------------------------------------------------------------ [INFO] Reactor Summary: [INFO] [INFO] Spark Project Parent POM ........................... SUCCESS [01:20 min] [INFO] Spark Project Networking ........................... SUCCESS [01:31 min] [INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 47.808 s] [INFO] Spark Project Core ................................. SUCCESS [34:00 min] [INFO] Spark Project Bagel ................................ SUCCESS [03:21 min] [INFO] Spark Project GraphX ............................... SUCCESS [09:22 min] [INFO] Spark Project Streaming ............................ SUCCESS [15:07 min] [INFO] Spark Project Catalyst ............................. SUCCESS [14:35 min] [INFO] Spark Project SQL .................................. SUCCESS [16:31 min] [INFO] Spark Project ML Library ........................... SUCCESS [18:15 min] [INFO] Spark Project Tools ................................ SUCCESS [01:50 min] [INFO] Spark Project Hive ................................. SUCCESS [13:58 min] [INFO] Spark Project REPL ................................. SUCCESS [06:13 min] [INFO] Spark Project YARN ................................. SUCCESS [07:05 min] [INFO] Spark Project Hive Thrift Server ................... FAILURE [01:39 min] [INFO] Spark Project Assembly ............................. SKIPPED [INFO] Spark Project External Twitter ..................... SKIPPED [INFO] Spark Project External Flume Sink .................. SKIPPED [INFO] Spark Project External Flume ....................... SKIPPED [INFO] Spark Project External MQTT ........................ SKIPPED [INFO] Spark Project External ZeroMQ ...................... SKIPPED [INFO] Spark Project Examples ............................. SKIPPED [INFO] Spark Project YARN Shuffle Service ................. SKIPPED [INFO] ------------------------------------------------------------------------ [INFO] BUILD FAILURE [INFO] ------------------------------------------------------------------------ [INFO] Total time: 02:25 h [INFO] Finished at: 2015-04-16T14:11:24+08:00 [INFO] Final Memory: 62M/362M [INFO] ------------------------------------------------------------------------ [WARNING] The requested profile "hadoop-2.5" could not be activated because it does not exist. [ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.0:compile (scala-compile-first) on project spark-hive-thriftserver_2.11: Execution scala-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.0:compile failed. CompileFailed -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn <goals> -rf :spark-hive-thriftserver_2.11

java 实现 sparksql 时,mysql数据库查询结果只有表头没有数据

这两天尝试用java实现sparksql连接mysql数据库,经过调试可以成功连接到数据库,但奇怪的是只能够查询出表头和表结构却看不到表里面数据 代码如下 import java.util.Hashtable; import java.util.Properties; import javax.swing.JFrame; import org.apache.avro.hadoop.io.AvroKeyValue.Iterator; import org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB; import org.apache.hadoop.hive.ql.exec.vector.expressions.IsNull; import org.apache.log4j.Logger; import org.apache.spark.SparkConf; import org.apache.spark.SparkContext; import org.apache.spark.api.java.JavaSparkContext; import org.apache.spark.rdd.RDD; import org.apache.spark.sql.DataFrameReader; import org.apache.spark.sql.DataFrameWriter; import org.apache.spark.sql.Dataset; import org.apache.spark.sql.Row; import org.apache.spark.sql.SQLContext; import org.apache.spark.sql.SaveMode; import org.apache.spark.sql.SparkSession; import org.apache.spark.sql.SparkSession.Builder; import org.apache.spark.sql.jdbc.JdbcDialect; import org.datanucleus.store.rdbms.identifier.IdentifierFactory; import antlr.collections.List; import scala.Enumeration.Val; public class Demo_Mysql3 { private static Logger logger = Logger.getLogger(Demo_Mysql3.class); public static void main(String[] args) { SparkConf sparkConf = new SparkConf(); sparkConf.setAppName("Demo_Mysql3"); sparkConf.setMaster("local[5]"); sparkConf.setSparkHome("F:\\DownLoad\\spark\\spark-2.0.0-bin-hadoop2.7"); sparkConf.set("spark.sql.warehouse.dir","F:\\DownLoad\\spark\\spark-2.0.0-bin-hadoop2.7"); SparkContext sc0=null; try { sc0=new SparkContext(sparkConf); SparkSession sparkSession=new SparkSession(sc0); SQLContext sqlContext = new SQLContext(sparkSession); // 一个条件表示一个分区 String[] predicates = new String[] { "1=1 order by id limit 400000,50000", "1=1 order by id limit 450000,50000", "1=1 order by id limit 500000,50000", "1=1 order by id limit 550000,50000", "1=1 order by id limit 600000,50000" }; String url = "jdbc:mysql://localhost:3306/clone"; String table = "image"; Properties connectionProperties = new Properties(); connectionProperties.setProperty("dbtable", table);// 设置表 connectionProperties.setProperty("user", "root");// 设置用户名 connectionProperties.setProperty("password", "root");// 设置密码 // 读取数据 DataFrameReader jread = sqlContext.read(); //Dataset<Row> jdbcDs=jread.jdbc(url, table, predicates, connectionProperties); sqlContext.read().jdbc(url, table, predicates, connectionProperties).select("*").show(); } catch (Exception e) { logger.error("|main|exception error", e); } finally { if (sc0 != null) { sc0.stop(); } } } } 控制台输出如下: ![图片说明](https://img-ask.csdn.net/upload/201707/22/1500708689_839040.png)

如何用spark将带有union类型的avro消息存入hive

我现在是用spark streaming 读取kafka中的avro消息,反序列化后希望使用spark sql存入hive或者hdfs 但是不管我是将avro转成json还是case class,即使在DStream.print能够正确打印出来,使用spark.foreachRdd进行后续操作时都会报错。 1. 我找了第三方包,将avdl文件转成caseclass,由于avro中存在union(record,record,record)的类型,所以case class是带有shapeless的类型的,在转成spark后会报<none> is not a term的错误 2. 这一次是使用json4s将对象转换成了json,同样,在DStream.print能够正确打印,但是进入spark.foreachRdd后就会报错,no value for 'BAD', 不知道是不是因为不同的json中有一些属性是null,有一些不是null 3. 这一次我直接使用avro的tostring,和不同的是,在DStream中能打印,进入foreachRdd也并没有报错,但是部分json显示是corrupt. 第二和第三的json我把打印出来的字符串放入json文件里,使用spark.read.json读都是完全没问题的,我就有点搞不明白了,各位大神,有办法能够处理吗? 大家如果有这种需求一般是如何做的呢? Code Snippet: ``` def main { val kafkaStream = KafkaUtils.createDirectStream[String, Array[Byte]](ssc, PreferConsistent, Subscribe[String, Array[Byte]](topics, kafkaParams)) println("Schema:" + schema) val stream = deserialize(config, kafkaStream) process(approach, isPrint, stream) ssc.start() ssc.awaitTermination() } def process(approach: Int, isPrint: Boolean, stream: DStream[DeserializedFromKafkaRecord]): Unit = { approach match { /** * Approach 1 avrohugger + avro4s * Convert GenericData into CaseClass */ case 1 => { val mappedStream = stream.map(record => { val format = RecordFormat[A] val ennio = format.from(record.value) ennio }) if (isPrint) { mappedStream.print() }else { mappedStream.foreachRDD(foreachFunc = rdd => { if (!rdd.partitions.isEmpty) { val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate() import spark.implicits._ val df = rdd.toDF() df.show() } }) } } /** * Approach 2 avro4s + json4s * Convert CaseClass into Json */ case 2 => { val mappedStream = stream.map(record => { val format = RecordFormat[A] val ennio = format.from(record.value) implicit val formats = DefaultFormats.preservingEmptyValues write(ennio) }) if (isPrint) { mappedStream.print() }else { mappedStream.foreachRDD(foreachFunc = rdd => { if (!rdd.partitions.isEmpty) { val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate() import spark.implicits._ val df = spark.read.json(spark.createDataset(rdd)) df.show() } }) } } /** * Approach 3 * Convert GenericData into Json */ case 3 => { val mappedStream = stream.mapPartitions(partition => { partition.map(_.value.toString) }) if (isPrint) { mappedStream.print() }else { mappedStream.foreachRDD(foreachFunc = rdd => { if (!rdd.partitions.isEmpty) { val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate() import spark.implicits._ val df = spark.read.json(spark.createDataset(rdd)) df.show() } }) } } ```

spark jdbc连接impala报错Method not supported

各位好 我的spark是2.1.0,用的hive-jdbc 2.1.0,现在写入impala的时候报以下错: java.sql.SQLException: Method not supported at org.apache.hive.jdbc.HivePreparedStatement.addBatch(HivePreparedStatement.java:75) at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:589) at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:670) at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:670) at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:925) at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:925) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802) at scala.Option.foreach(Option.scala:257) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:802) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1650) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:628) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1918) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1931) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1944) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1958) at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:925) at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1.apply(RDD.scala:923) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) at org.apache.spark.rdd.RDD.withScope(RDD.scala:362) at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:923) at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply$mcV$sp(Dataset.scala:2305) at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply(Dataset.scala:2305) at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$1.apply(Dataset.scala:2305) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57) at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2765) at org.apache.spark.sql.Dataset.foreachPartition(Dataset.scala:2304) at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.saveTable(JdbcUtils.scala:670) at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:77) at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:518) at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:215) at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:446) at com.aoyou.data.CustomerVisitProduct$.saveToHive(CustomerVisitProduct.scala:281) at com.aoyou.data.CustomerVisitProduct$.main(CustomerVisitProduct.scala:221) at com.aoyou.data.CustomerVisitProduct.main(CustomerVisitProduct.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.sql.SQLException: Method not supported at org.apache.hive.jdbc.HivePreparedStatement.addBatch(HivePreparedStatement.java:75) at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:589) at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:670) at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:670) at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:925) at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:925) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944) at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 以下是代码实现 val sparkConf = new SparkConf().setAppName("save").set("spark.sql.crossJoin.enabled", "true"); val sparkSession = SparkSession .builder() .enableHiveSupport() .getOrCreate(); val dataframe = sparkSession.createDataFrame(rddSchema, new Row().getClass()) val property = new Properties(); property.put("user", "xxxxx") property.put("password", "xxxxx") dataframe.write.mode(SaveMode.Append).option("driver", "org.apache.hive.jdbc.HiveDriver").jdbc("jdbc:hive2://xxxx:21050/rawdata;auth=noSasl", "tablename", property) 请问这是怎么回事啊?感觉是驱动版本问题

Hive JDBC 连接异常问题

代码: String driverName = "org.apache.hive.jdbc.HiveDriver"; String url = "jdbc:hive://192.168.1.108:10000/default"; String user = ""; String password = ""; String sql = ""; ResultSet res = null; Class.forName(driverName); Connection con = DriverManager.getConnection(url, user, password); 错: Exception in thread "main" java.sql.SQLException: No suitable driver found for jdbc:hive://192.168.1.108:10000/default at java.sql.DriverManager.getConnection(Unknown Source) at java.sql.DriverManager.getConnection(Unknown Source) 所需包都有还包异常。。。求助

spark 出现很严重的数据倾斜,跑批时间很长,有时候会报错

目前在开发一个统计指标的脚本,跑批出现了严重的数据倾斜, ![图片说明](https://img-ask.csdn.net/upload/201902/21/1550735821_877072.jpg) 有时候报java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE,但是分析sql里用到的key,在之前的商户号作为key的时候,差别也是最多的几百条,最少的几条,但是几百条的居多,以商户号和维度类别作为key的时候,分布如下 key条数正数5个 +---+---------------+--------------+ | 条数| merchant_id|statistic_type| +---+---------------+--------------+ | 5| null| 5| | 4|822100047220249| 5| | 4|303300048120004| 5| | 4|303450053310001| 5| | 4|303650058130002| 5| +---+---------------+--------------+ key条数倒数5个 +---+---------------+--------------+ | 条数| merchant_id|statistic_type| +---+---------------+--------------+ | 1|822100051310533| 6| | 1|822100059630118| 6| | 1|822100052512420| 6| | 1|822100055411357| 6| | 1|822100058124973| 6| +---+---------------+--------------+ ,可是再task里,大部分都平均,就有某一个task数据量是其他的1000倍左右,看起来不像我group by的key分布不均导致的,请教各位大神,这是什么原因导致的?

scala编写从hdfs文件取数据通过phoenix将数据批量插入到hbase,报错对象没有序列化

求大牛指导指导问题所在 批量插入的方法如下 class PhoenixClient_v2(host:String,port:String) extends Serializable { try{ Class.forName("org.apache.phoenix.jdbc.PhoenixDriver") } catch{ case e :ClassNotFoundException => println(e) } def InsertOfBatch(phoenixSQL:String,Key:Array[String],Column:Array[String],filepath:String=null,dataArr:Array[Map[String,Any]]=null) = { val sparkConf = new SparkConf().setMaster("local").setAppName("hdfs2hbase") val sc = new SparkContext(sparkConf) //从json格式的hdfs文件中读取 if (filepath != null) { val url = "jdbc:phoenix:" + host + ":" + port val conn = DriverManager.getConnection(url) val ps = conn.prepareStatement(phoenixSQL) try { //利用spark从文件系统或者hdfs上读取文件 val Rdd_file = sc.textFile(filepath) var HashKey:String = (hashCode()%100).toString //从rdd中取出相应数据到ps中 Rdd_file.map{y=> Key.foreach(x => if (x == "Optime") HashKey += "|"+new JSONObject(y).getString(x).split(" ")(0) else HashKey += "|"+new JSONObject(y).getString(x)) ps.setString(1,HashKey) Column.foreach(x => ps.setString(Column.indexOf(x) + 2, new JSONObject(y).getString(x))) ps.addBatch() } ps.executeBatch() conn.commit() } catch{ case e: SQLException => println("Insert into phoenix from file failed:" + e.toString) }finally { conn.close() ps.close() } } //读取Map[String,Any]映射的集合 if (dataArr != null){ val url = "jdbc:phoenix:" + host + ":" + port val conn = DriverManager.getConnection(url) val ps = conn.prepareStatement(phoenixSQL) try { val Rdd_Arr = sc.parallelize(dataArr) var HashKey:String = (hashCode()%100).toString Rdd_Arr.map{y=> Key.foreach(x => if (x == "Optime") HashKey += "|"+y.get(x).toString.split(" ")(0) else HashKey += "|"+y.get(x) ) ps.setString(1,HashKey) Column.foreach(x => ps.setString(Column.indexOf(x) + 2, y.get(x).toString)) ps.addBatch() } ps.executeBatch() conn.commit() } catch{ case e: SQLException => println("Insert into phoenix from file failed:" + e.toString) }finally { conn.close() ps.close() } } } } 调用的方式如下 object PhoenixTest{ def main(args: Array[String]): Unit = { val pc = new PhoenixClient_v2("zk1","2181") println("select result :"+pc.Select_Sql("select * from \"OP_record\"").toString) pc.CreateTable("create table if not exists \"OP_record\"(\"pk\" varchar(100) not null primary key, \"cf1\".\"curdonate\" varchar(20),\"cf1\".\"curcharge\" varchar(20),\"cf1\".\"uniqueid\" varchar(20))","OP_record") val KeyArr = Array("serverid","userid","Optime") val ColumnArr = Array("curdonate","curcharge","uniqueid") pc.InsertOfBatch("upsert into \"OP_record\" values(?,?,?,?)",KeyArr,ColumnArr,"hdfs://zk1:9000/chen/hive_OP_record_diamonds_20170702.log.1499028228.106") } } 报错信息如下: Exception in thread "main" org.apache.spark.SparkException: Task not serializable at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:304) at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:294) at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122) at org.apache.spark.SparkContext.clean(SparkContext.scala:2032) at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:318) at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:317) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108) at org.apache.spark.rdd.RDD.withScope(RDD.scala:310) at org.apache.spark.rdd.RDD.map(RDD.scala:317) at scala.Hbase.PhoenixClient_v2.InsertOfBatch(PhoenixClient_v2.scala:123) at scala.Hbase.PhoenixTest$.main(PhoenixClient_v2.scala:230) at scala.Hbase.PhoenixTest.main(PhoenixClient_v2.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.io.NotSerializableException: org.apache.phoenix.jdbc.PhoenixPreparedStatement Serialization stack: - object not serializable (class: org.apache.phoenix.jdbc.PhoenixPreparedStatement, value: upsert into "OP_record" values(?,?,?,?)) - field (class: scala.Hbase.PhoenixClient_v2$$anonfun$InsertOfBatch$1, name: ps$1, type: interface java.sql.PreparedStatement) - object (class scala.Hbase.PhoenixClient_v2$$anonfun$InsertOfBatch$1, <function1>) at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40) at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47) at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:84) at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:301) ... 21 more 17/07/06 16:35:26 INFO SparkContext: Invoking stop() from shutdown hook 17/07/06 16:35:26 INFO SparkUI: Stopped Spark web UI at http://192.168.12.243:4040 17/07/06 16:35:26 INFO DAGScheduler: Stopping DAGScheduler 17/07/06 16:35:27 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 17/07/06 16:35:27 INFO MemoryStore: MemoryStore cleared 17/07/06 16:35:27 INFO BlockManager: BlockManager stopped 17/07/06 16:35:27 INFO BlockManagerMaster: BlockManagerMaster stopped 17/07/06 16:35:27 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 17/07/06 16:35:27 INFO SparkContext: Successfully stopped SparkContext 17/07/06 16:35:27 INFO ShutdownHookManager: Shutdown hook called 17/07/06 16:35:27 INFO ShutdownHookManager: Deleting directory /tmp/spark-18cc3337-cdda-498f-a7f3-124f239a6bfe

CSV文件列(空字符串值为NULL)

<div class="post-text" itemprop="text"> <p>I'm trying to import a data set from a CSV file. This works correctly except that blank values don't get inserted as NULL. </p> <p>Somebody try to tell me to use load infile function but I really don't get it can someone help me out in this one here is the code.</p> <pre><code>if ($_FILES[csv][size] &gt; 0) { //get the csv file $file = mysql_real_escape_string($_FILES[csv][tmp_name]); $handle = fopen($file,"r"); //loop through the csv file and insert into database do { if($data[0]){ mysql_query("INSERT IGNORE INTO faculty(FCode,FName,MName,LName,Gender,image_name,BDate,Title,Service,EmpStat,CollegeID,DepartmentID,dateCreated) values('".mysql_real_escape_string($data[0])."','".mysql_real_escape_string($data[1])."','" . mysql_real_escape_string($data[2])."','" . mysql_real_escape_string($data[3])."','" . mysql_real_escape_string($data[4])."','" . mysql_real_escape_string($data[5])."','" . mysql_real_escape_string($data[6])."','" . mysql_real_escape_string($data[7])."','" . mysql_real_escape_string($data[8])."','" . mysql_real_escape_string($data[9])."','" . mysql_real_escape_string($data[10])."','" . mysql_real_escape_string($data[11])."','$currentDate')") or die ("error in sql syntax" .mysql_error()); } } while ($data = fgetcsv($handle,1000,",","'")); </code></pre> </div>

win10下运行hadoop程序报错

在win10的eclipse上运行一段代码需要用到hadoop。一开始报错:Could not locate executable null\bin\winutils.exe in the Hadoop binaries 按照指导下载了winutils.exe 并且配置了环境变量。再运行程序又报错: Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: --------- at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522) at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171) at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:162) at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:160) at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:167) at org.apache.spark.sql.CarbonContext.<init>(CarbonContext.scala:34) at org.carbondata.examples.util.InitForExamples$.createCarbonContext(InitForExamples.scala:42) at org.carbondata.examples.CarbonExample$.main(CarbonExample.scala:26) at org.carbondata.examples.CarbonExample.main(CarbonExample.scala) Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: --------- at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612) at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508) ... 8 more ``` 按照指导在win10 cmd窗口后运行 winutils.exe chmod 777 /tmp/hive 报如下错误: ChangeFileModeByMask error (3): ??????????? 哪位大神现身救助

2019 AI开发者大会

2019 AI开发者大会(AI ProCon 2019)是由中国IT社区CSDN主办的AI技术与产业年度盛会。多年经验淬炼,如今蓄势待发:2019年9月6-7日,大会将有近百位中美顶尖AI专家、知名企业代表以及千余名AI开发者齐聚北京,进行技术解读和产业论证。我们不空谈口号,只谈技术,诚挚邀请AI业内人士一起共铸人工智能新篇章!

实现简单的文件系统

实验内容: 通过对具体的文件存储空间的管理、文件的物理结构、目录结构和文件操作的实现,加深对文件系统内部功能和实现过程的理解。 要求: 1.在内存中开辟一个虚拟磁盘空间作为文件存储器,在其上实现一个简

MIPS单周期CPU-组成原理实验-华中科技大学

使用logisim布线完成的MIPS单周期CPU,可支持28条指令。跑马灯的代码已经装入了寄存器,可以直接开启时钟运行。

2019数学建模A题高压油管的压力控制 省一论文即代码

2019数学建模A题高压油管的压力控制省一完整论文即详细C++和Matlab代码,希望对同学们有所帮助

基于QT和OpenCV的五子棋实现源码

一个简单的五子棋应用,基于QT和OpenCV的实现源码,通过相邻棋子判断是否获胜,不包含人工智能算法,适合新手入门

Git 实用技巧

这几年越来越多的开发团队使用了Git,掌握Git的使用已经越来越重要,已经是一个开发者必备的一项技能;但很多人在刚开始学习Git的时候会遇到很多疑问,比如之前使用过SVN的开发者想不通Git提交代码为什么需要先commit然后再去push,而不是一条命令一次性搞定; 更多的开发者对Git已经入门,不过在遇到一些代码冲突、需要恢复Git代码时候就不知所措,这个时候哪些对 Git掌握得比较好的少数人,就像团队中的神一样,在队友遇到 Git 相关的问题的时候用各种流利的操作来帮助队友于水火。 我去年刚加入新团队,发现一些同事对Git的常规操作没太大问题,但对Git的理解还是比较生疏,比如说分支和分支之间的关联关系、合并代码时候的冲突解决、提交代码前未拉取新代码导致冲突问题的处理等,我在协助处理这些问题的时候也记录各种问题的解决办法,希望整理后通过教程帮助到更多对Git操作进阶的开发者。 本期教程学习方法分为“掌握基础——稳步进阶——熟悉协作”三个层次。从掌握基础的 Git的推送和拉取开始,以案例进行演示,分析每一个步骤的操作方式和原理,从理解Git 工具的操作到学会代码存储结构、演示不同场景下Git遇到问题的不同处理方案。循序渐进让同学们掌握Git工具在团队协作中的整体协作流程。 在教程中会通过大量案例进行分析,案例会模拟在工作中遇到的问题,从最基础的代码提交和拉取、代码冲突解决、代码仓库的数据维护、Git服务端搭建等。为了让同学们容易理解,对Git简单易懂,文章中详细记录了详细的操作步骤,提供大量演示截图和解析。在教程的最后部分,会从提升团队整体效率的角度对Git工具进行讲解,包括规范操作、Gitlab的搭建、钩子事件的应用等。 为了让同学们可以利用碎片化时间来灵活学习,在教程文章中大程度降低了上下文的依赖,让大家可以在工作之余进行学习与实战,并同时掌握里面涉及的Git不常见操作的相关知识,理解Git工具在工作遇到的问题解决思路和方法,相信一定会对大家的前端技能进阶大有帮助。

实用主义学Python(小白也容易上手的Python实用案例)

原价169,限时立减100元! 系统掌握Python核心语法16点,轻松应对工作中80%以上的Python使用场景! 69元=72讲+源码+社群答疑+讲师社群分享会&nbsp; 【哪些人适合学习这门课程?】 1)大学生,平时只学习了Python理论,并未接触Python实战问题; 2)对Python实用技能掌握薄弱的人,自动化、爬虫、数据分析能让你快速提高工作效率; 3)想学习新技术,如:人工智能、机器学习、深度学习等,这门课程是你的必修课程; 4)想修炼更好的编程内功,优秀的工程师肯定不能只会一门语言,Python语言功能强大、使用高效、简单易学。 【超实用技能】 从零开始 自动生成工作周报 职场升级 豆瓣电影数据爬取 实用案例 奥运冠军数据分析 自动化办公:通过Python自动化分析Excel数据并自动操作Word文档,最终获得一份基于Excel表格的数据分析报告。 豆瓣电影爬虫:通过Python自动爬取豆瓣电影信息并将电影图片保存到本地。 奥运会数据分析实战 简介:通过Python分析120年间奥运会的数据,从不同角度入手分析,从而得出一些有趣的结论。 【超人气老师】 二两 中国人工智能协会高级会员 生成对抗神经网络研究者 《深入浅出生成对抗网络:原理剖析与TensorFlow实现》一书作者 阿里云大学云学院导师 前大型游戏公司后端工程师 【超丰富实用案例】 0)图片背景去除案例 1)自动生成工作周报案例 2)豆瓣电影数据爬取案例 3)奥运会数据分析案例 4)自动处理邮件案例 5)github信息爬取/更新提醒案例 6)B站百大UP信息爬取与分析案例 7)构建自己的论文网站案例

深度学习原理+项目实战+算法详解+主流框架(套餐)

深度学习系列课程从深度学习基础知识点开始讲解一步步进入神经网络的世界再到卷积和递归神经网络,详解各大经典网络架构。实战部分选择当下最火爆深度学习框架PyTorch与Tensorflow/Keras,全程实战演示框架核心使用与建模方法。项目实战部分选择计算机视觉与自然语言处理领域经典项目,从零开始详解算法原理,debug模式逐行代码解读。适合准备就业和转行的同学们加入学习! 建议按照下列课程顺序来进行学习 (1)掌握深度学习必备经典网络架构 (2)深度框架实战方法 (3)计算机视觉与自然语言处理项目实战。(按照课程排列顺序即可)

C/C++跨平台研发从基础到高阶实战系列套餐

一 专题从基础的C语言核心到c++ 和stl完成基础强化; 二 再到数据结构,设计模式完成专业计算机技能强化; 三 通过跨平台网络编程,linux编程,qt界面编程,mfc编程,windows编程,c++与lua联合编程来完成应用强化 四 最后通过基于ffmpeg的音视频播放器,直播推流,屏幕录像,

三个项目玩转深度学习(附1G源码)

从事大数据与人工智能开发与实践约十年,钱老师亲自见证了大数据行业的发展与人工智能的从冷到热。事实证明,计算机技术的发展,算力突破,海量数据,机器人技术等,开启了第四次工业革命的序章。深度学习图像分类一直是人工智能的经典任务,是智慧零售、安防、无人驾驶等机器视觉应用领域的核心技术之一,掌握图像分类技术是机器视觉学习的重中之重。针对现有线上学习的特点与实际需求,我们开发了人工智能案例实战系列课程。打造:以项目案例实践为驱动的课程学习方式,覆盖了智能零售,智慧交通等常见领域,通过基础学习、项目案例实践、社群答疑,三维立体的方式,打造最好的学习效果。

Java基础知识面试题(2020最新版)

文章目录Java概述何为编程什么是Javajdk1.5之后的三大版本JVM、JRE和JDK的关系什么是跨平台性?原理是什么Java语言有哪些特点什么是字节码?采用字节码的最大好处是什么什么是Java程序的主类?应用程序和小程序的主类有何不同?Java应用程序与小程序之间有那些差别?Java和C++的区别Oracle JDK 和 OpenJDK 的对比基础语法数据类型Java有哪些数据类型switc...

Python界面版学生管理系统

前不久上传了一个控制台版本的学生管理系统,这个是Python界面版学生管理系统,这个是使用pycharm开发的一个有界面的学生管理系统,基本的增删改查,里面又演示视频和完整代码,有需要的伙伴可以自行下

Vue.js 2.0之全家桶系列视频课程

基于新的Vue.js 2.3版本, 目前新全的Vue.js教学视频,让你少走弯路,直达技术前沿! 1. 包含Vue.js全家桶(vue.js、vue-router、axios、vuex、vue-cli、webpack、ElementUI等) 2. 采用笔记+代码案例的形式讲解,通俗易懂

linux“开发工具三剑客”速成攻略

工欲善其事,必先利其器。Vim+Git+Makefile是Linux环境下嵌入式开发常用的工具。本专题主要面向初次接触Linux的新手,熟练掌握工作中常用的工具,在以后的学习和工作中提高效率。

JAVA初级工程师面试36问(完结)

第三十一问: 说一下线程中sleep()和wait()区别? 1 . sleep()是让正在执行的线程主动让出CPU,当时间到了,在回到自己的线程让程序运行。但是它并没有释放同步资源锁只是让出。 2.wait()是让当前线程暂时退让出同步资源锁,让其他线程来获取到这个同步资源在调用notify()方法,才会让其解除wait状态,再次参与抢资源。 3. sleep()方法可以在任何地方使用,而wait()只能在同步方法或同步块使用。 ...

java jdk 8 帮助文档 中文 文档 chm 谷歌翻译

JDK1.8 API 中文谷歌翻译版 java帮助文档 JDK API java 帮助文档 谷歌翻译 JDK1.8 API 中文 谷歌翻译版 java帮助文档 Java最新帮助文档 本帮助文档是使用谷

我以为我对Mysql事务很熟,直到我遇到了阿里面试官

太惨了,面试又被吊打

智鼎(附答案).zip

并不是完整题库,但是有智鼎在线2019年9、10、11三个月的试题,有十七套以上题目,普通的网申行测题足以对付,可以在做题时自己总结一些规律,都不是很难

Visual Assist X 破解补丁

vs a's'sixt插件 支持vs2008-vs2019 亲测可以破解,希望可以帮助到大家

150讲轻松搞定Python网络爬虫

【为什么学爬虫?】 &nbsp; &nbsp; &nbsp; &nbsp;1、爬虫入手容易,但是深入较难,如何写出高效率的爬虫,如何写出灵活性高可扩展的爬虫都是一项技术活。另外在爬虫过程中,经常容易遇到被反爬虫,比如字体反爬、IP识别、验证码等,如何层层攻克难点拿到想要的数据,这门课程,你都能学到! &nbsp; &nbsp; &nbsp; &nbsp;2、如果是作为一个其他行业的开发者,比如app开发,web开发,学习爬虫能让你加强对技术的认知,能够开发出更加安全的软件和网站 【课程设计】 一个完整的爬虫程序,无论大小,总体来说可以分成三个步骤,分别是: 网络请求:模拟浏览器的行为从网上抓取数据。 数据解析:将请求下来的数据进行过滤,提取我们想要的数据。 数据存储:将提取到的数据存储到硬盘或者内存中。比如用mysql数据库或者redis等。 那么本课程也是按照这几个步骤循序渐进的进行讲解,带领学生完整的掌握每个步骤的技术。另外,因为爬虫的多样性,在爬取的过程中可能会发生被反爬、效率低下等。因此我们又增加了两个章节用来提高爬虫程序的灵活性,分别是: 爬虫进阶:包括IP代理,多线程爬虫,图形验证码识别、JS加密解密、动态网页爬虫、字体反爬识别等。 Scrapy和分布式爬虫:Scrapy框架、Scrapy-redis组件、分布式爬虫等。 通过爬虫进阶的知识点我们能应付大量的反爬网站,而Scrapy框架作为一个专业的爬虫框架,使用他可以快速提高我们编写爬虫程序的效率和速度。另外如果一台机器不能满足你的需求,我们可以用分布式爬虫让多台机器帮助你快速爬取数据。 &nbsp; 从基础爬虫到商业化应用爬虫,本套课程满足您的所有需求! 【课程服务】 专属付费社群+每周三讨论会+1v1答疑

JavaWEB商城项目(包括数据库)

功能描述:包括用户的登录注册,以及个人资料的修改.商品的分类展示,详情,加入购物车,生成订单,到银行支付等!另外还有收货地址的和我的收藏等常用操作.环境(JDK 1.7 ,mysql 5.5,Ecli

Python数据挖掘简易入门

&nbsp; &nbsp; &nbsp; &nbsp; 本课程为Python数据挖掘方向的入门课程,课程主要以真实数据为基础,详细介绍数据挖掘入门的流程和使用Python实现pandas与numpy在数据挖掘方向的运用,并深入学习如何运用scikit-learn调用常用的数据挖掘算法解决数据挖掘问题,为进一步深入学习数据挖掘打下扎实的基础。

一学即懂的计算机视觉(第一季)

图像处理和计算机视觉的课程大家已经看过很多,但总有“听不透”,“用不了”的感觉。课程致力于创建人人都能听的懂的计算机视觉,通过生动、细腻的讲解配合实战演练,让学生真正学懂、用会。 【超实用课程内容】 课程内容分为三篇,包括视觉系统构成,图像处理基础,特征提取与描述,运动跟踪,位姿估计,三维重构等内容。课程理论与实战结合,注重教学内容的可视化和工程实践,为人工智能视觉研发及算法工程师等相关高薪职位就业打下坚实基础。 【课程如何观看?】 PC端:https://edu.csdn.net/course/detail/26281 移动端:CSDN 学院APP(注意不是CSDN APP哦) 本课程为录播课,课程2年有效观看时长,但是大家可以抓紧时间学习后一起讨论哦~ 【学员专享增值服务】 源码开放 课件、课程案例代码完全开放给你,你可以根据所学知识,自行修改、优化 下载方式:电脑登录https://edu.csdn.net/course/detail/26281,点击右下方课程资料、代码、课件等打包下载

软件测试2小时入门

本课程内容系统、全面、简洁、通俗易懂,通过2个多小时的介绍,让大家对软件测试有个系统的理解和认识,具备基本的软件测试理论基础。 主要内容分为5个部分: 1 软件测试概述,了解测试是什么、测试的对象、原则、流程、方法、模型;&nbsp; 2.常用的黑盒测试用例设计方法及示例演示;&nbsp; 3 常用白盒测试用例设计方法及示例演示;&nbsp; 4.自动化测试优缺点、使用范围及示例‘;&nbsp; 5.测试经验谈。

初级玩转Linux+Ubuntu(嵌入式开发基础课程)

课程主要面向嵌入式Linux初学者、工程师、学生 主要从一下几方面进行讲解: 1.linux学习路线、基本命令、高级命令 2.shell、vi及vim入门讲解 3.软件安装下载、NFS、Samba、FTP等服务器配置及使用

2019 Python开发者日-培训

本次活动将秉承“只讲技术,拒绝空谈”的理念,邀请十余位身处一线的Python技术专家,重点围绕Web开发、自动化运维、数据分析、人工智能等技术模块,分享真实生产环境中使用Python应对IT挑战的真知灼见。此外,针对不同层次的开发者,大会还安排了深度培训实操环节,为开发者们带来更多深度实战的机会。

快速入门Android开发 视频 教程 android studio

这是一门快速入门Android开发课程,顾名思义是让大家能快速入门Android开发。 学完能让你学会如下知识点: Android的发展历程 搭建Java开发环境 搭建Android开发环境 Android Studio基础使用方法 Android Studio创建项目 项目运行到模拟器 项目运行到真实手机 Android中常用控件 排查开发中的错误 Android中请求网络 常用Android开发命令 快速入门Gradle构建系统 项目实战:看美图 常用Android Studio使用技巧 项目签名打包 如何上架市场

机器学习初学者必会的案例精讲

通过六个实际的编码项目,带领同学入门人工智能。这些项目涉及机器学习(回归,分类,聚类),深度学习(神经网络),底层数学算法,Weka数据挖掘,利用Git开源项目实战等。

4小时玩转微信小程序——基础入门与微信支付实战

这是一个门针对零基础学员学习微信小程序开发的视频教学课程。课程采用腾讯官方文档作为教程的唯一技术资料来源。杜绝网络上质量良莠不齐的资料给学员学习带来的障碍。 视频课程按照开发工具的下载、安装、使用、程序结构、视图层、逻辑层、微信小程序等几个部分组织课程,详细讲解整个小程序的开发过程

YOLOv3目标检测实战系列课程

《YOLOv3目标检测实战系列课程》旨在帮助大家掌握YOLOv3目标检测的训练、原理、源码与网络模型改进方法。 本课程的YOLOv3使用原作darknet(c语言编写),在Ubuntu系统上做项目演示。 本系列课程包括三门课: (1)《YOLOv3目标检测实战:训练自己的数据集》 包括:安装darknet、给自己的数据集打标签、整理自己的数据集、修改配置文件、训练自己的数据集、测试训练出的网络模型、性能统计(mAP计算和画出PR曲线)和先验框聚类。 (2)《YOLOv3目标检测:原理与源码解析》讲解YOLOv1、YOLOv2、YOLOv3的原理、程序流程并解析各层的源码。 (3)《YOLOv3目标检测:网络模型改进方法》讲解YOLOv3的改进方法,包括改进1:不显示指定类别目标的方法 (增加功能) ;改进2:合并BN层到卷积层 (加快推理速度) ; 改进3:使用GIoU指标和损失函数 (提高检测精度) ;改进4:tiny YOLOv3 (简化网络模型)并介绍 AlexeyAB/darknet项目。

Qt5 局域网通信软件(模仿QQ)

采用Qt5进行开发的局域网通信客户端+Server,界面模仿QQ的界面,聊天界面采用QWidget绘制的气泡!

Python数据清洗实战入门

本次课程主要以真实的电商数据为基础,通过Python详细的介绍了数据分析中的数据清洗阶段各种技巧和方法。

YOLOv3目标检测实战:训练自己的数据集

YOLOv3是一种基于深度学习的端到端实时目标检测方法,以速度快见长。本课程将手把手地教大家使用labelImg标注和使用YOLOv3训练自己的数据集。课程分为三个小项目:足球目标检测(单目标检测)、梅西目标检测(单目标检测)、足球和梅西同时目标检测(两目标检测)。 本课程的YOLOv3使用Darknet,在Ubuntu系统上做项目演示。包括:安装Darknet、给自己的数据集打标签、整理自己的数据集、修改配置文件、训练自己的数据集、测试训练出的网络模型、性能统计(mAP计算和画出PR曲线)和先验框聚类。 Darknet是使用C语言实现的轻型开源深度学习框架,依赖少,可移植性好,值得深入探究。 除本课程《YOLOv3目标检测实战:训练自己的数据集》外,本人推出了有关YOLOv3目标检测的系列课程,请持续关注该系列的其它课程视频,包括: 《YOLOv3目标检测实战:交通标志识别》 《YOLOv3目标检测:原理与源码解析》 《YOLOv3目标检测:网络模型改进方法》 敬请关注并选择学习!

Python可以这样学(第一季:Python内功修炼)

董付国系列教材《Python程序设计基础》、《Python程序设计(第2版)》、《Python可以这样学》配套视频,讲解Python 3.5.x和3.6.x语法、内置对象用法、选择与循环以及函数设计与使用、lambda表达式用法、字符串与正则表达式应用、面向对象编程、文本文件与二进制文件操作、目录操作与系统运维、异常处理结构。

ANSI-VITA-46.0-2007(中文翻译版)_V1.0.pdf

ANSI-VITA 46.0 2007老版本的的英文版本的中文翻译版本,含目录,通过谷歌翻译进行文字翻译,翻译的不是很专业,对初学者理解VPX接口及协议有一定的帮助,后续会对翻译进行改进。

Java 最常见的 200+ 面试题:面试必备

这份面试清单是从我 2015 年做了 TeamLeader 之后开始收集的,一方面是给公司招聘用,另一方面是想用它来挖掘在 Java 技术栈中,还有那些知识点是我不知道的,我想找到这些技术盲点,然后修复它,以此来提高自己的技术水平。虽然我是从 2009 年就开始参加编程工作了,但我依旧觉得自己现在要学的东西很多,并且学习这些知识,让我很有成就感和满足感,那所以何乐而不为呢? 说回面试的事,这份面试...

相关热词 c#中如何设置提交按钮 c#帮助怎么用 c# 读取合并单元格的值 c#带阻程序 c# 替换span内容 c# rpc c#控制台点阵字输出 c#do while循环 c#调用dll多线程 c#找出两个集合不同的
立即提问