Spark (with built-in Hive) cannot read data from main/sub-collection tables

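[Detailed problem description]
Spark (with its built-in Hive) cannot read data from a main/sub-collection table, although tables that are not main collections can be read. Spark version: spark-1.3.0-bin-hadoop2.4
JAR files in use:
spark-sequoiadb-1.12.jar
sequoiadb-driver-1.12.jar
hadoop-sequoiadb-1.12.jar
hive-sequoiadb-1.12.jar
postgresql-9.4-1201-jdbc41.jar
Querying the main table fails with the following error: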
select * from test201607_cs.tb_order limit 1 ;

Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 16.0 failed 4 times, most recent failure: Lost task 0.3 in stage 16.0 (TID 362, sdb-223.3golden.hq): com.sequoiadb.exception.BaseException: errorType:SDB_DMS_CS_NOTEXIST,Collection space does not exist
Exception Detail:test201607_cs
at com.sequoiadb.base.Sequoiadb.getCollectionSpace(Sequoiadb.java:598)
at com.sequoiadb.hive.SdbReader.<init>(SdbReader.java:145)
at com.sequoiadb.hive.SdbHiveInputFormat.getRecordReader(SdbHiveInputFormat.java:120)
at org.apache.spark.rdd.HadoopRDD$anon$1.<init>(HadoopRDD.scala:236)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:212)

Result of querying a non-main table:
select * from test201607_cs.test_hive limit 1 ;

+----------+
| shop_id  |
+----------+
| 10048    |
+----------+

1 answer

You can use the Spark connector instead:

CREATE TABLE st_order (
  shop_id string,
  `date` string
) USING com.sequoiadb.spark OPTIONS (
  host 'localhost:11810',
  collectionspace 'test201607_cs',
  collection 'st_order'
);

Note that keywords such as date must be enclosed in backticks (`), as with the `date` column above.
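
If the table definition is generated by a script rather than typed by hand, one way to avoid tracking which column names are keywords is to backtick-quote every column. The following is only a minimal sketch in Python: the helper names quote_ident and build_create_table are hypothetical, and the table name, columns, and connector options are simply copied from the statement in the answer. It only builds the statement text; it does not connect to Spark or SequoiaDB.

```python
def quote_ident(name):
    """Wrap a column name in backticks so keywords such as date parse as names."""
    # Assumption: names containing a backtick are rejected rather than escaped.
    if "`" in name:
        raise ValueError("identifier must not contain a backtick: %r" % name)
    return "`%s`" % name


def build_create_table(table, columns, options):
    """Build a CREATE TABLE ... USING com.sequoiadb.spark statement as a string.

    columns: list of (name, type) pairs.
    options: list of (key, value) pairs; values are assumed to contain no
    single quotes, since this sketch does not escape them.
    """
    cols = ", ".join("%s %s" % (quote_ident(n), t) for n, t in columns)
    opts = ", ".join("%s '%s'" % (k, v) for k, v in options)
    return "CREATE TABLE %s (%s) USING com.sequoiadb.spark OPTIONS (%s)" % (
        table, cols, opts)


# Values taken from the answer; adjust host and names for your own cluster.
ddl = build_create_table(
    "st_order",
    [("shop_id", "string"), ("date", "string")],
    [("host", "localhost:11810"),
     ("collectionspace", "test201607_cs"),
     ("collection", "st_order")],
)
print(ddl)
```

The resulting string can then be submitted through whichever SQL entry point you already use (for example the same beeline or spark-sql session that ran the queries in the question).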

(executor 3) (7/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 50) in 339 ms on datanode03 (executor 1) (8/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 10.0 in stage 1.0 (TID 62, datanode02, executor 3, partition 10, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 6.0 in stage 1.0 (TID 58) in 56 ms on datanode02 (executor 3) (9/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 11.0 in stage 1.0 (TID 63, datanode01, executor 2, partition 11, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 5.0 in stage 1.0 (TID 55) in 284 ms on datanode01 (executor 2) (10/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 12.0 in stage 1.0 (TID 64, datanode01, executor 2, partition 12, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 52) in 287 ms on datanode01 (executor 2) (11/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 13.0 in stage 1.0 (TID 65, datanode02, executor 3, partition 13, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 15.0 in stage 1.0 (TID 66, datanode02, executor 3, partition 15, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 10.0 in stage 1.0 (TID 62) in 25 ms on datanode02 (executor 3) (12/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 9.0 in stage 1.0 (TID 61) in 29 ms on datanode02 (executor 3) (13/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 17.0 in stage 1.0 (TID 67, datanode02, executor 3, partition 17, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 15.0 in stage 1.0 (TID 66) in 13 ms on datanode02 (executor 3) (14/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 13.0 in stage 1.0 (TID 65) in 16 ms on datanode02 (executor 3) (15/20) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 18.0 in stage 1.0 (TID 68, datanode02, executor 3, partition 18, NODE_LOCAL, 7638 
bytes) 19/08/13 19:53:27 INFO TaskSetManager: Starting task 19.0 in stage 1.0 (TID 69, datanode01, executor 2, partition 19, NODE_LOCAL, 7638 bytes) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 11.0 in stage 1.0 (TID 63) in 30 ms on datanode01 (executor 2) (16/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 12.0 in stage 1.0 (TID 64) in 30 ms on datanode01 (executor 2) (17/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 17.0 in stage 1.0 (TID 67) in 17 ms on datanode02 (executor 3) (18/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 19.0 in stage 1.0 (TID 69) in 13 ms on datanode01 (executor 2) (19/20) 19/08/13 19:53:27 INFO TaskSetManager: Finished task 18.0 in stage 1.0 (TID 68) in 20 ms on datanode02 (executor 3) (20/20) 19/08/13 19:53:27 INFO YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool 19/08/13 19:53:27 INFO DAGScheduler: ResultStage 1 (start at VoiceApplication2.java:128) finished in 0.406 s 19/08/13 19:53:27 INFO DAGScheduler: Job 0 finished: start at VoiceApplication2.java:128, took 1.850883 s 19/08/13 19:53:27 INFO ReceiverTracker: Starting 1 receivers 19/08/13 19:53:27 INFO ReceiverTracker: ReceiverTracker started 19/08/13 19:53:27 INFO KafkaInputDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO KafkaInputDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO KafkaInputDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Initialized and validated org.apache.spark.streaming.kafka.KafkaInputDStream@5fd3dc81 19/08/13 19:53:27 INFO ForEachDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO ForEachDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO ForEachDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Initialized and validated 
org.apache.spark.streaming.dstream.ForEachDStream@4044ec97 19/08/13 19:53:27 INFO KafkaInputDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO KafkaInputDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO KafkaInputDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Initialized and validated org.apache.spark.streaming.kafka.KafkaInputDStream@5fd3dc81 19/08/13 19:53:27 INFO MappedDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO MappedDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO MappedDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO MappedDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO MappedDStream: Initialized and validated org.apache.spark.streaming.dstream.MappedDStream@5dd4b960 19/08/13 19:53:27 INFO ForEachDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO ForEachDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO ForEachDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@132d0c3c 19/08/13 19:53:27 INFO KafkaInputDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO KafkaInputDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO KafkaInputDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO KafkaInputDStream: Initialized and validated org.apache.spark.streaming.kafka.KafkaInputDStream@5fd3dc81 19/08/13 19:53:27 INFO MappedDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO MappedDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO MappedDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO MappedDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO MappedDStream: 
Initialized and validated org.apache.spark.streaming.dstream.MappedDStream@5dd4b960 19/08/13 19:53:27 INFO ForEachDStream: Slide time = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Storage level = Serialized 1x Replicated 19/08/13 19:53:27 INFO ForEachDStream: Checkpoint interval = null 19/08/13 19:53:27 INFO ForEachDStream: Remember interval = 60000 ms 19/08/13 19:53:27 INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@525bed0c 19/08/13 19:53:27 INFO DAGScheduler: Got job 1 (start at VoiceApplication2.java:128) with 1 output partitions 19/08/13 19:53:27 INFO DAGScheduler: Final stage: ResultStage 2 (start at VoiceApplication2.java:128) 19/08/13 19:53:27 INFO DAGScheduler: Parents of final stage: List() 19/08/13 19:53:27 INFO DAGScheduler: Missing parents: List() 19/08/13 19:53:27 INFO DAGScheduler: Submitting ResultStage 2 (Receiver 0 ParallelCollectionRDD[3] at makeRDD at ReceiverTracker.scala:613), which has no missing parents 19/08/13 19:53:27 INFO ReceiverTracker: Receiver 0 started 19/08/13 19:53:27 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 133.5 KB, free 366.2 MB) 19/08/13 19:53:27 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 36.3 KB, free 366.1 MB) 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on datanode02:31984 (size: 36.3 KB, free: 366.3 MB) 19/08/13 19:53:27 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1039 19/08/13 19:53:27 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (Receiver 0 ParallelCollectionRDD[3] at makeRDD at ReceiverTracker.scala:613) (first 15 tasks are for partitions Vector(0)) 19/08/13 19:53:27 INFO YarnClusterScheduler: Adding task set 2.0 with 1 tasks 19/08/13 19:53:27 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 70, datanode01, executor 2, partition 0, PROCESS_LOCAL, 8757 bytes) 19/08/13 19:53:27 INFO RecurringTimer: 
Started timer for JobGenerator at time 1565697240000 19/08/13 19:53:27 INFO JobGenerator: Started JobGenerator at 1565697240000 ms 19/08/13 19:53:27 INFO JobScheduler: Started JobScheduler 19/08/13 19:53:27 INFO StreamingContext: StreamingContext started 19/08/13 19:53:27 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on datanode01:20487 (size: 36.3 KB, free: 2.5 GB) 19/08/13 19:53:27 INFO ReceiverTracker: Registered receiver for stream 0 from 10.1.229.158:64276 19/08/13 19:54:00 INFO JobScheduler: Added jobs for time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Starting job streaming job 1565697240000 ms.0 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Starting job streaming job 1565697240000 ms.1 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Finished job streaming job 1565697240000 ms.1 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Finished job streaming job 1565697240000 ms.0 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO JobScheduler: Starting job streaming job 1565697240000 ms.2 from job set of time 1565697240000 ms 19/08/13 19:54:00 INFO SharedState: loading hive config file: file:/data01/hadoop/yarn/local/usercache/hdfs/filecache/85431/__spark_conf__.zip/__hadoop_conf__/hive-site.xml 19/08/13 19:54:00 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('hdfs://CID-042fb939-95b4-4b74-91b8-9f94b999bdf7/apps/hive/warehouse'). 19/08/13 19:54:00 INFO SharedState: Warehouse path is 'hdfs://CID-042fb939-95b4-4b74-91b8-9f94b999bdf7/apps/hive/warehouse'. 
19/08/13 19:54:00 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint 19/08/13 19:54:00 INFO BlockManagerInfo: Removed broadcast_1_piece0 on datanode02:31984 in memory (size: 1979.0 B, free: 366.3 MB) 19/08/13 19:54:00 INFO BlockManagerInfo: Removed broadcast_1_piece0 on datanode02:28863 in memory (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:54:00 INFO BlockManagerInfo: Removed broadcast_1_piece0 on datanode01:20487 in memory (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:54:00 INFO BlockManagerInfo: Removed broadcast_1_piece0 on datanode03:3328 in memory (size: 1979.0 B, free: 2.5 GB) 19/08/13 19:54:02 INFO CodeGenerator: Code generated in 175.416957 ms 19/08/13 19:54:02 INFO JobScheduler: Finished job streaming job 1565697240000 ms.2 from job set of time 1565697240000 ms 19/08/13 19:54:02 ERROR JobScheduler: Error running job streaming job 1565697240000 ms.2 org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'meta_voice' not found; at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:40) at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:331) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:388) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:398) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:393) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:122) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:115) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51) at 
org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at scala.util.Try$.apply(Try.scala:192) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 19/08/13 19:54:02 ERROR ApplicationMaster: User class threw exception: org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'meta_voice' not found; org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'meta_voice' not found; at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:40) at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:331) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:388) at 
org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:398) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:393) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:122) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:115) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at scala.util.Try$.apply(Try.scala:192) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 19/08/13 19:54:02 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'meta_voice' not found; at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:40) at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:331) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.tableExists(SessionCatalog.scala:388) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:398) at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:393) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:122) at com.stream.VoiceApplication2$2.call(VoiceApplication2.java:115) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$2.apply(JavaDStreamLike.scala:280) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50) at scala.util.Try$.apply(Try.scala:192) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39) at 
org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) ) 19/08/13 19:54:02 INFO StreamingContext: Invoking stop(stopGracefully=true) from shutdown hook 19/08/13 19:54:02 INFO ReceiverTracker: Sent stop signal to all 1 receivers 19/08/13 19:54:02 ERROR ReceiverTracker: Deregistered receiver for stream 0: Stopped by driver 19/08/13 19:54:02 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 70) in 35055 ms on datanode01 (executor 2) (1/1) 19/08/13 19:54:02 INFO YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool 19/08/13 19:54:02 INFO DAGScheduler: ResultStage 2 (start at VoiceApplication2.java:128) finished in 35.086 s 19/08/13 19:54:02 INFO ReceiverTracker: Waiting for receiver job to terminate gracefully 19/08/13 19:54:02 INFO ReceiverTracker: Waited for receiver job to terminate gracefully 19/08/13 19:54:02 INFO ReceiverTracker: All of the receivers have deregistered successfully 19/08/13 19:54:02 INFO ReceiverTracker: ReceiverTracker stopped 19/08/13 19:54:02 INFO JobGenerator: Stopping JobGenerator gracefully 19/08/13 19:54:02 INFO JobGenerator: Waiting for all received blocks to be consumed for job generation 19/08/13 19:54:02 INFO JobGenerator: Waited for all received blocks to be consumed for job generation 19/08/13 19:54:12 WARN ShutdownHookManager: ShutdownHook '$anon$2' timeout, 
java.util.concurrent.TimeoutException java.util.concurrent.TimeoutException at java.util.concurrent.FutureTask.get(FutureTask.java:205) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:67) 19/08/13 19:54:12 ERROR Utils: Uncaught exception in thread pool-1-thread-1 java.lang.InterruptedException at java.lang.Object.wait(Native Method) at java.lang.Thread.join(Thread.java:1252) at java.lang.Thread.join(Thread.java:1326) at org.apache.spark.streaming.util.RecurringTimer.stop(RecurringTimer.scala:86) at org.apache.spark.streaming.scheduler.JobGenerator.stop(JobGenerator.scala:137) at org.apache.spark.streaming.scheduler.JobScheduler.stop(JobScheduler.scala:123) at org.apache.spark.streaming.StreamingContext$$anonfun$stop$1.apply$mcV$sp(StreamingContext.scala:681) at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1357) at org.apache.spark.streaming.StreamingContext.stop(StreamingContext.scala:680) at org.apache.spark.streaming.StreamingContext.org$apache$spark$streaming$StreamingContext$$stopOnShutdown(StreamingContext.scala:714) at org.apache.spark.streaming.StreamingContext$$anonfun$start$1.apply$mcV$sp(StreamingContext.scala:599) at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188) at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1988) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188) at 
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188) at scala.util.Try$.apply(Try.scala:192) at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188) at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) ```
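One detail in the trace above may be worth checking: the failing frames go through `InMemoryCatalog`, which Spark 2.x uses when the session was created without Hive support. With Hive support enabled, the lookup would normally go through `HiveExternalCatalog`. The sketch below is only an illustration in Python (the original job is Java). `catalog_from_trace` is a hypothetical helper that reports which catalog a stack trace mentions, and `build_hive_session` shows the PySpark call `enableHiveSupport()`, assuming PySpark is installed and hive-site.xml is visible to the driver.

```python
import re

# Catalog classes that appear in Spark 2.x stack traces, mapped to the
# value of spark.sql.catalogImplementation they correspond to.
CATALOG_CLASSES = {
    "InMemoryCatalog": "in-memory",
    "HiveExternalCatalog": "hive",
}


def catalog_from_trace(trace):
    """Return 'in-memory', 'hive', or None, based on the first catalog class named in the trace."""
    match = re.search(r"\b(InMemoryCatalog|HiveExternalCatalog)\b", trace)
    return CATALOG_CLASSES[match.group(1)] if match else None


def build_hive_session(app_name):
    """Create a SparkSession with Hive support (requires PySpark; not called here)."""
    from pyspark.sql import SparkSession  # imported lazily so the helper above runs without Spark

    return SparkSession.builder.appName(app_name).enableHiveSupport().getOrCreate()


# Excerpt of the trace from the question.
sample = (
    "org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: "
    "Database 'meta_voice' not found; at "
    "org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:331)"
)
print(catalog_from_trace(sample))
```

If the helper reports `in-memory`, the session is probably not talking to the Hive metastore at all, which would explain why an existing database is reported as missing.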
Spark reports an error when reading a Hive table over JDBC (running in Zeppelin)
## Code:
import org.apache.spark.sql.hive.HiveContext
val pro = new java.util.Properties()
pro.setProperty("user", "****")
pro.setProperty("password", "*****")
val driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";
Class.forName(driverName);
val hiveContext = new HiveContext(sc)
val hivetable = hiveContext.read.jdbc("jdbc:hive://*****/default", "*****", pro);
## Error:
import org.apache.spark.sql.hive.HiveContext
pro: java.util.Properties = {}
res15: Object = null
res16: Object = null
driverName: String = org.apache.hadoop.hive.jdbc.HiveDriver
res17: Class[_] = class org.apache.hadoop.hive.jdbc.HiveDriver
warning: there was one deprecation warning; re-run with -deprecation for details
hiveContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@14f9cc13
java.sql.SQLException: Method not supported
at org.apache.hadoop.hive.jdbc.HiveResultSetMetaData.isSigned(Unknown Source)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:232)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:64)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:45)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:166)
... 46 elided
DataX: configuring the field delimiter when exporting data from Hive to MySQL
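The Hive table was created with \t as the field delimiter. When I set the field delimiter to \t in the export JSON, the export fails: only one field is recognized when reading from Hive, so the fields were clearly not split. Searching online suggests setting the field delimiter to the default \u0001, but that applies only when no delimiter was specified at table creation. I want to understand the cause: why can't my \t be used for the export? Is Hive unable to recognize it? I created the table with \t. Does it need to be converted to some character set? I am rather confused about character encodings, so any pointers would be appreciated.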
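One way to narrow this down is to check which character the reader actually receives and whether it splits a sample row. The sketch below is an illustration in Python, not DataX code: it parses a hypothetical JSON fragment to show that `"\t"` in JSON decodes to a real tab while `"\\t"` decodes to a backslash followed by `t`, then splits a made-up row with each.

```python
import json

# In JSON, "\t" decodes to a single tab character, while "\\t" decodes to
# two characters: a backslash and the letter t.
tab_config = json.loads(r'{"fieldDelimiter": "\t"}')
literal_config = json.loads(r'{"fieldDelimiter": "\\t"}')

# A made-up row as it would sit in a tab-delimited Hive text file.
row = "1001\talice\t2019-08-13\t42"


def split_row(line, delimiter):
    """Split one text line into fields using the configured delimiter."""
    return line.split(delimiter)


# Each line prints: length of the decoded delimiter, number of fields it yields.
print(len(tab_config["fieldDelimiter"]), len(split_row(row, tab_config["fieldDelimiter"])))
print(len(literal_config["fieldDelimiter"]), len(split_row(row, literal_config["fieldDelimiter"])))
```

If the job behaves like the second case, the string in the configuration is not the byte stored in the file, and the whole row comes back as a single field.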
Spark SQL: the table name cannot contain "." when reading a Hive table
val hiveDeptDF = sqlContext.read.table("emp_test.emp")
I want to read the emp table in the Hive database emp_test, but it fails with an error saying the name cannot contain ".":
Exception in thread "main" org.apache.spark.sql.AnalysisException: Specifying database name or other qualifiers are not allowed for temporary tables. If the table name has dots (.) in it, please quote the table name with backticks (`).;
at org.apache.spark.sql.catalyst.analysis.Catalog$class.getTableName(Catalog.scala:70)
at org.apache.spark.sql.catalyst.analysis.SimpleCatalog.getTableName(Catalog.scala:82)
at org.apache.spark.sql.catalyst.analysis.SimpleCatalog.lookupRelation(Catalog.scala:104)
at org.apache.spark.sql.DataFrameReader.table(DataFrameReader.scala:338)
at Hive2Rdbms$.main(Hive2Rdbms.scala:16)
at Hive2Rdbms.main(Hive2Rdbms.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
After I add backticks, it reports that the table cannot be found. The Hive database itself is fine.
DataX: exporting data from Hive to MySQL
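Importing data from MySQL into Hive works fine. When exporting from Hive, I get this message: [Your configuration is wrong.]. - The column configuration is invalid. In the job you configured, the number of fields read from the source (1) does not equal the number of fields to be written to the target table (4). Please check your configuration and correct it. Here is my JSON file:
```
{
  "job": {
    "content": [{
      "reader": {
        "parameter": {
          "path": "/apps/hive/warehouse/test.db/job01",
          "column": ["*"],
          "defaultFS": "hdfs://xxxx.xx.xx:8020",
          "encoding": "utf-8",
          "fieldDelimiter": "\u0001",
          "fileType": "text"
        },
        "name": "hdfsreader"
      },
      "writer": {
        "parameter": {
          "password": "*****",
          "column": ["*"],
          "connection": [{
            "jdbcUrl": "jdbc:mysql://xxxxx:3308/groundcherry",
            "table": ["scoop_test"]
          }],
          "writeMode": "insert",
          "username": "****"
        },
        "name": "mysqlwriter"
      }
    }],
    "setting": { "speed": { "channel": 1 } }
  }
}
```
People online say it is a delimiter problem. With the default \u0001 the import works, but the export fails, and using "," gives the same result. The field delimiter specified when the Hive table was created matches this setting, so I doubt the delimiter is really the cause. What else could lead to this error? Any pointers would be appreciated, thanks.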
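A source field count of 1 usually means the configured delimiter never occurs in the line being read. A quick check is to take one raw line from the table's data file and count the fields each candidate delimiter yields. The sketch below assumes a plain-text file; `field_counts`, `matching_delimiters`, the candidate list, and the sample line are all hypothetical names and data for illustration.

```python
# Candidate delimiters to try; \u0001 is Hive's default for text tables.
CANDIDATES = ("\u0001", "\t", ",", "|")


def field_counts(line, candidates=CANDIDATES):
    """Map each candidate delimiter to the number of fields it yields for one line."""
    return {delim: len(line.split(delim)) for delim in candidates}


def matching_delimiters(line, expected_fields, candidates=CANDIDATES):
    """Return the candidates that split the line into exactly expected_fields fields."""
    counts = field_counts(line, candidates)
    return [delim for delim in candidates if counts[delim] == expected_fields]


# Made-up sample line; in practice it would come from a file under the table's HDFS path.
sample_line = "7\tjob01\trunning\t2019-08-13"
print([repr(d) for d in matching_delimiters(sample_line, 4)])
```

If no candidate yields the expected four fields, the file may not be delimited text at all, which would be worth ruling out before changing `fieldDelimiter` again.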
Hive on Spark query fails
Help! When running the BigBench tests with Hive on Spark on Hadoop, I keep getting an error. The log shows: FAILED: SemanticException Failed to get a spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client. WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked. WARN: Please see http://www.slf4j.org/codes.html#release for an explanation. An error occured while running command: ========== runEngineCmd -f /var/lib/hadoop-hdfs/Big-Bench/engines/hive/queries/q04/q04.sql ========== I have gone through a lot of material online. Some say it is a version mismatch, others say it happens intermittently. Could someone take a look? I am at my wits' end.
Problem with Impala reading Hive metadata
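Hive works normally. After switching to Impala, the Hive database and table metadata can be read, but the tables' column information cannot, and queries fail. ![Image description](https://img-ask.csdn.net/upload/201808/06/1533552128_262405.png) Has anyone run into a similar problem?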
Hive on Spark cannot process Parquet tables
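I kept hitting pitfalls while setting up Hive on Spark. There is material online, but it is all much the same: it touches on the topic and stops short of spelling things out, which is frustrating.
Here is what happened. According to the Spark website, Hive on Spark requires a Spark distribution that does not include Hive, while the packages on the website all include Hive, so the only option left was to build it myself.
There are several ways to build. The official documentation mainly describes three: Spark's own make-distribution.sh script, Maven, and SBT. I started with make-distribution.sh. The build was maddening and failed again and again, but it eventually succeeded. My approach was simply to rerun the build command after each failure and repeat.
The command I used with make-distribution.sh was:
./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.6,parquet-provided"
However, the resulting spark-assembly-*.jar was only 106 MB, and after deploying Spark it failed even at startup. Searching online, I found that some people built with make-distribution.sh without any errors, while others hit the same errors as I did and only succeeded after switching to Maven. Since my make-distribution.sh build did not work, I also switched to Maven, and that build did succeed. Installation and running worked without any problem. The Maven command I used was:
mvn -Phadoop-2.6 -Pyarn -Dhadoop.version=2.6.5 -Dyarn.version=2.6.5 -Dscala-2.10 -DskipTests clean package
Just when I thought everything was settled, another problem appeared. I need to store Hive data in Parquet format, and that fails with the following error:
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
Caused by: java.lang.reflect.InvocationTargetException
Caused by: java.lang.NoSuchMethodError:org.apache.parquet.schema.Types$MessageTypeBuilder.addFields([Lorg/apache/parquet/schema/Type;)Lorg/apache/parquet/schema/Types$BaseGroupBuilder;
I searched Baidu, Google, and Bing and still could not find where the problem lies. How can this be solved? Any advice would be appreciated.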
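A `NoSuchMethodError` generally means the class loaded at runtime comes from a different library version than the one the caller was compiled against, so one thing that may be worth checking here is whether more than one Parquet version ends up on the classpath. The sketch below is a generic diagnostic in Python, not something taken from the Spark or Hive tooling: `conflicting_artifacts` is a hypothetical helper that groups jar file names by artifact and reports those present in several versions. The jar names and versions in the listing are made up.

```python
import re
from collections import defaultdict

# Matches names like parquet-column-1.8.1.jar -> ("parquet-column", "1.8.1").
JAR_PATTERN = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.]*)\.jar$")


def conflicting_artifacts(jar_names):
    """Group jar file names by artifact and return those present in more than one version."""
    versions = defaultdict(set)
    for name in jar_names:
        match = JAR_PATTERN.match(name)
        if match:
            versions[match.group("artifact")].add(match.group("version"))
    return {artifact: sorted(found) for artifact, found in versions.items() if len(found) > 1}


# Hypothetical listing; in practice collect the names from the Hive and Spark lib directories.
jars = [
    "parquet-column-1.6.0.jar",
    "parquet-column-1.8.1.jar",
    "parquet-hadoop-1.8.1.jar",
    "hive-exec-1.2.1.jar",
]
print(conflicting_artifacts(jars))
```

Bundled or shaded jars that embed Parquet classes under another file name would not be caught by this name-based check, so an empty result does not rule out a conflict.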
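Problems connecting Zeppelin to Hive and Spark
1. Connecting to Hive: Zeppelin connects to Hive through HiveServer2. Because there is a large amount of metadata, Zeppelin seems to traverse the metadata every time, and every statement runs with a delay of more than an hour. 2. Connecting to Spark SQL fails with: java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT at org.apache.spark.sql.hive.HiveUtils$.hiveClientConfig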
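What is the difference between Hive on Spark and Spark SQL?
Hive on Spark and Spark SQL both use the Spark engine for computation, so personally I do not see much difference. Someone online said that Hive on Spark was developed by Cloudera, while Spark SQL was developed by the Spark project. Does that count as a difference? Is the syntax different? Could an expert please explain?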
After putting hive-site.xml into spark/conf/ I get a pile of warnings. How do I deal with them, and does it matter if I do not?
When I configured things earlier, I never noticed that I had forgotten to put the hive-site.xml configuration file into spark/conf. Today I put the file there, and now pyspark prints a pile of messages as soon as it starts, and Spark SQL prints a pile of warnings as well. The warnings are below; because they are so long, I will state my question here first. I would like to know how to raise the log level for these messages, or how to resolve the warnings. Any help would be appreciated, thanks! ```shell To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel). 2019-11-14 17:13:50,994 WARN conf.HiveConf: HiveConf of name hive.metastore.client.capability.check does not exist 2019-11-14 17:13:50,994 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.aggregate.stats.false.positive.probability does not exist 2019-11-14 17:13:50,994 WARN conf.HiveConf: HiveConf of name hive.druid.broker.address.default does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.llap.io.orc.time.counters does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.tez.task.scale.memory.reserve-fraction.min does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.orc.splits.ms.footer.cache.ppd.enabled does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.metastore.event.message.factory does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.server2.metrics.enabled does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.tez.hs2.user.access does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.druid.storage.storageDirectory does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.llap.am.liveness.connection.timeout.ms does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.tez.dynamic.semijoin.reduction.threshold does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.server2.thrift.client.connect.retry.limit does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.xmx.headroom does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.tez.dynamic.semijoin.reduction does not exist
2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.llap.io.allocator.direct does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.llap.auto.enforce.stats does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.llap.client.consistent.splits does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.server2.tez.session.lifetime does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.timedout.txn.reaper.start does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.cache.ttl does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.llap.management.acl does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.delegation.token.lifetime does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.server2.authentication.ldap.guidKey does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.ats.hook.queue.capacity does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.strict.checks.large.query does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.tez.bigtable.minsize.semijoin.reduction does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.llap.io.allocator.alloc.min does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.server2.thrift.client.user does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.llap.io.encode.alloc.size does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.wait.queue.comparator.class.name does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.output.service.port does not exist 2019-11-14 17:13:50,995 WARN conf.HiveConf: HiveConf of name hive.orc.cache.use.soft.references does not exist 
2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.llap.io.encode.enabled does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.tez.task.scale.memory.reserve.fraction.max does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.llap.task.communicator.listener.thread-count does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.tez.container.max.java.heap.fraction does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.stats.column.autogather does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.am.liveness.heartbeat.interval.ms does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.llap.io.decoding.metrics.percentiles.intervals does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.groupby.position.alias does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.metastore.txn.store.impl does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.spark.use.groupby.shuffle does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.llap.object.cache.enabled does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.server2.parallel.ops.in.session does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.groupby.limit.extrastep does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.server2.webui.use.ssl does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.service.metrics.file.location does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.server2.thrift.client.retry.delay.seconds does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.materializedview.fileformat does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name 
hive.llap.daemon.num.file.cleaner.threads does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.test.fail.compaction does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.blobstore.use.blobstore.as.scratchdir does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.service.metrics.class does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.llap.io.allocator.mmap.path does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.download.permanent.fns does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.server2.webui.max.historic.queries does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.vectorized.execution.reducesink.new.enabled does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.compactor.max.num.delta does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.compactor.history.retention.attempted does not exist 2019-11-14 17:13:50,996 WARN conf.HiveConf: HiveConf of name hive.server2.webui.port does not exist 2019-11-14 17:13:50,999 WARN conf.HiveConf: HiveConf of name hive.compactor.initiator.failed.compacts.threshold does not exist 2019-11-14 17:13:50,999 WARN conf.HiveConf: HiveConf of name hive.service.metrics.reporter does not exist 2019-11-14 17:13:50,999 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.output.service.max.pending.writes does not exist 2019-11-14 17:13:50,999 WARN conf.HiveConf: HiveConf of name hive.llap.execution.mode does not exist 2019-11-14 17:13:50,999 WARN conf.HiveConf: HiveConf of name hive.llap.enable.grace.join.in.llap does not exist 2019-11-14 17:13:50,999 WARN conf.HiveConf: HiveConf of name hive.optimize.limittranspose does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.llap.io.memory.mode does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: 
HiveConf of name hive.llap.io.threadpool.size does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.druid.select.threshold does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.scratchdir.lock does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.server2.webui.use.spnego does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.service.metrics.file.frequency does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.llap.hs2.coordinator.enabled does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.llap.task.scheduler.timeout.seconds does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.optimize.filter.stats.reduction does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.exec.orc.base.delta.ratio does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.metastore.fastpath does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.server2.clear.dangling.scratchdir does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.test.fail.heartbeater does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.llap.file.cleanup.delay.seconds does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.llap.management.rpc.port does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.mapjoin.hybridgrace.bloomfilter does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.llap.auto.enforce.tree does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.metastore.stats.ndv.tuner does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.direct.sql.max.query.length does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.compactor.history.retention.failed 
does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.server2.close.session.on.disconnect does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.optimize.ppd.windowing does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.metastore.initial.metadata.count.enabled does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.server2.webui.host does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.orc.splits.ms.footer.cache.enabled does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.optimize.point.lookup.min does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.file.metadata.threads does not exist 2019-11-14 17:13:51,000 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.service.refresh.interval.sec does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.llap.auto.max.output.size does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.driver.parallel.compilation does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.llap.remote.token.requires.signing does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.tez.bucket.pruning does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.llap.cache.allow.synthetic.fileid does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.hash.table.inflation.factor does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.aggr.stats.hbase.ttl does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.llap.auto.enforce.vectorized does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.writeset.reaper.interval does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name 
hive.vectorized.use.vector.serde.deserialize does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.order.columnalignment does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.output.service.send.buffer.size does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.exec.schema.evolution does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.direct.sql.max.elements.values.clause does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.server2.llap.concurrent.queries does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.llap.auto.allow.uber does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.druid.indexer.partition.size.max does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.llap.auto.auth does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.orc.splits.include.fileid does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.communicator.num.threads does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.orderby.position.alias does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.llap.task.communicator.connection.sleep.between.retries.ms does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.aggregate.stats.max.partitions does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.service.metrics.hadoop2.component does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.yarn.shuffle.port does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.direct.sql.max.elements.in.clause does not exist 2019-11-14 17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.druid.passiveWaitTimeMs does not exist 2019-11-14 
17:13:51,001 WARN conf.HiveConf: HiveConf of name hive.load.dynamic.partitions.thread does not exist 2019-11-14 17:13:51,002 WARN conf.HiveConf: HiveConf of name hive.druid.indexer.segments.granularity does not exist 2019-11-14 17:13:51,002 WARN conf.HiveConf: HiveConf of name hive.server2.thrift.http.response.header.size does not exist 2019-11-14 17:13:51,002 WARN conf.HiveConf: HiveConf of name hive.conf.internal.variable.list does not exist 2019-11-14 17:13:51,002 WARN conf.HiveConf: HiveConf of name hive.optimize.limittranspose.reductionpercentage does not exist 2019-11-14 17:13:51,002 WARN conf.HiveConf: HiveConf of name hive.repl.cm.enabled does not exist 2019-11-14 17:13:51,002 WARN conf.HiveConf: HiveConf of name hive.server2.thrift.client.retry.limit does not exist 2019-11-14 17:13:51,002 WARN conf.HiveConf: HiveConf of name hive.server2.thrift.resultset.serialize.in.tasks does not exist 2019-11-14 17:13:51,002 WARN conf.HiveConf: HiveConf of name hive.enable.spark.execution.engine does not exist 2019-11-14 17:13:51,002 WARN conf.HiveConf: HiveConf of name hive.query.timeout.seconds does not exist 2019-11-14 17:13:51,002 WARN conf.HiveConf: HiveConf of name hive.service.metrics.hadoop2.frequency does not exist 2019-11-14 17:13:51,002 WARN conf.HiveConf: HiveConf of name hive.orc.splits.directory.batch.ms does not exist 2019-11-14 17:13:51,004 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.cache.max.reader.wait does not exist 2019-11-14 17:13:51,004 WARN conf.HiveConf: HiveConf of name hive.llap.task.scheduler.node.reenable.max.timeout.ms does not exist 2019-11-14 17:13:51,004 WARN conf.HiveConf: HiveConf of name hive.max.open.txns does not exist 2019-11-14 17:13:51,004 WARN conf.HiveConf: HiveConf of name hive.auto.convert.sortmerge.join.reduce.side does not exist 2019-11-14 17:13:51,004 WARN conf.HiveConf: HiveConf of name hive.server2.zookeeper.publish.configs does not exist 2019-11-14 17:13:51,004 WARN conf.HiveConf: HiveConf of name 
hive.auto.convert.join.hashtable.max.entries does not exist 2019-11-14 17:13:51,004 WARN conf.HiveConf: HiveConf of name hive.server2.tez.sessions.init.threads does not exist 2019-11-14 17:13:51,004 WARN conf.HiveConf: HiveConf of name hive.metastore.authorization.storage.check.externaltable.drop does not exist 2019-11-14 17:13:51,004 WARN conf.HiveConf: HiveConf of name hive.execution.mode does not exist 2019-11-14 17:13:51,004 WARN conf.HiveConf: HiveConf of name hive.cbo.cnf.maxnodes does not exist 2019-11-14 17:13:51,004 WARN conf.HiveConf: HiveConf of name hive.vectorized.adaptor.usage.mode does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.materializedview.rewriting does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.server2.authentication.ldap.groupMembershipKey does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.catalog.cache.size does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.cbo.show.warnings does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.metastore.fshandler.threads does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.tez.max.bloom.filter.entries does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.llap.io.metadata.fraction does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.materializedview.serde does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.task.scheduler.wait.queue.size does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.aggr.stats.cache.entries does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.txn.operational.properties does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.aggr.stats.memory.ttl does not exist 2019-11-14 17:13:51,005 
WARN conf.HiveConf: HiveConf of name hive.llap.daemon.rpc.port does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.llap.io.nonvector.wrapper.enabled does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.aggregate.stats.cache.size does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.vectorized.use.vectorized.input.format does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.optimize.cte.materialize.threshold does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.cache.clean.until does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.optimize.semijoin.conversion does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.metastore.port does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.spark.dynamic.partition.pruning does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.metastore.metrics.enabled does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.repl.rootdir does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.metastore.limit.partition.request does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.async.log.enabled does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.logger does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.allow.udf.load.on.demand does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.cli.tez.session.async does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.tez.bloom.filter.factor does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.am-reporter.max.threads does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name 
hive.spark.use.file.size.for.mapjoin does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.strict.checks.bucketing does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.tez.bucket.pruning.compat does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.server2.webui.spnego.principal does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.task.preemption.metrics.intervals does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.shuffle.dir.watcher.enabled does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.llap.io.allocator.arena.count does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.metastore.use.SSL does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.llap.task.communicator.connection.timeout.ms does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.transpose.aggr.join does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.druid.maxTries does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.spark.dynamic.partition.pruning.max.data.size does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.druid.metadata.base does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.aggr.stats.invalidator.frequency does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.llap.io.use.lrfu does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.llap.io.allocator.mmap does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.druid.coordinator.address.default does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.server2.thrift.resultset.max.fetch.size does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: 
HiveConf of name hive.conf.hidden.list does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.io.sarg.cache.max.weight.mb does not exist 2019-11-14 17:13:51,005 WARN conf.HiveConf: HiveConf of name hive.server2.clear.dangling.scratchdir.interval does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.druid.sleep.time does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.vectorized.use.row.serde.deserialize does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.server2.compile.lock.timeout does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.timedout.txn.reaper.interval does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.aggregate.stats.max.variance does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.llap.io.lrfu.lambda does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.druid.metadata.db.type does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.output.stream.timeout does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.transactional.events.mem does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.server2.thrift.resultset.default.fetch.size does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.repl.cm.retain does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.merge.cardinality.check does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.server2.authentication.ldap.groupClassKey does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.optimize.point.lookup does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.llap.allow.permanent.fns does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name 
hive.llap.daemon.web.ssl does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.txn.manager.dump.lock.state.on.acquire.timeout does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.compactor.history.retention.succeeded does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.llap.io.use.fileid.path does not exist 2019-11-14 17:13:51,006 WARN conf.HiveConf: HiveConf of name hive.llap.io.encode.slice.row.count does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.mapjoin.optimized.hashtable.probe.percent does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.druid.select.distribute does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.llap.am.use.fqdn does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.llap.task.scheduler.node.reenable.min.timeout.ms does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.llap.validate.acls does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.support.special.characters.tablename does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.mv.files.thread does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.llap.skip.compile.udf.check does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.llap.io.encode.vector.serde.enabled does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.repl.cm.interval does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.server2.sleep.interval.between.start.attempts does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.yarn.container.mb does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.druid.http.read.timeout does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of 
name hive.blobstore.optimizations.enabled does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.llap.orc.gap.cache does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.optimize.dynamic.partition.hashjoin does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.exec.copyfile.maxnumfiles does not exist 2019-11-14 17:13:51,007 WARN conf.HiveConf: HiveConf of name hive.llap.io.encode.formats does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.druid.http.numConnection does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.task.scheduler.enable.preemption does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.num.executors does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.cache.max.full does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.connection.class does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.server2.tez.sessions.custom.queue.allowed does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.llap.io.encode.slice.lrr does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.server2.thrift.client.password does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.metastore.hbase.cache.max.writer.wait does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.server2.thrift.http.request.header.size does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.server2.webui.max.threads does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.optimize.limittranspose.reductiontuples does not exist 2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.test.rollbacktxn does not exist 2019-11-14 17:13:51,008 WARN 
conf.HiveConf: HiveConf of name hive.llap.task.scheduler.num.schedulable.tasks.per.node does not exist
2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.acl does not exist
2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.llap.io.memory.size does not exist
2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.strict.checks.type.safety does not exist
2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.server2.async.exec.async.compile does not exist
2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.llap.auto.max.input.size does not exist
2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.tez.enable.memory.manager does not exist
2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.msck.repair.batch.size does not exist
2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.blobstore.supported.schemes does not exist
2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.orc.splits.allow.synthetic.fileid does not exist
2019-11-14 17:13:51,008 WARN conf.HiveConf: HiveConf of name hive.stats.filter.in.factor does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.spark.use.op.stats does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.exec.input.listing.max.threads does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.server2.tez.session.lifetime.jitter does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.web.port does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.strict.checks.cartesian.product does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.rpc.num.handlers does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.vcpus.per.instance does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.count.open.txns.interval does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.tez.min.bloom.filter.entries does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.optimize.partition.columns.separate does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.orc.cache.stripe.details.mem.size does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.txn.heartbeat.threadpool.size does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.llap.task.scheduler.locality.delay does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.repl.cmrootdir does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.llap.task.scheduler.node.disable.backoff.factor does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.llap.am.liveness.connection.sleep.between.retries.ms does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.spark.exec.inplace.progress does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.druid.working.directory does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.llap.daemon.memory.per.instance.mb does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.msck.path.validation does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.tez.task.scale.memory.reserve.fraction does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.merge.nway.joins does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.compactor.history.reaper.interval does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.txn.strict.locking.mode does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.llap.io.encode.vector.serde.async.enabled does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.tez.input.generate.consistent.splits does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.server2.in.place.progress does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.druid.indexer.memory.rownum.max does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.server2.xsrf.filter.enabled does not exist
2019-11-14 17:13:51,009 WARN conf.HiveConf: HiveConf of name hive.llap.io.allocator.alloc.max does not exist
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.4.4
      /_/

Using Python version 3.7.4 (default, Sep 20 2019 17:49:03)
SparkSession available as 'spark'.
```
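Spark 2.1 MySQL-to-Hive sync job hangs on the last task and keeps running GC
### Notes:
+ job 0: pulls the data from MySQL and builds a temporary table (Parquet files) on HDFS.
+ job 1: writes the data from the temporary table into Hive.
+ In Spark 1.6, the default Parquet compression codec is gzip.
+ In Spark 2.1, the default Parquet compression codec is snappy.

### Problem description:
+ When Mario depends on Spark 1.6, a table such as table A (about 3 GB) runs smoothly, with an execution time under 15 minutes.
+ When Mario depends on Spark 2.1, the same table A (about 3 GB) stalls, with an execution time over 30 minutes. The cause is that job 1 hangs on its last task and runs GC continuously.

### Solution:
+ After changing the Parquet compression codec in Spark 2.1 from snappy to gzip, the problem went away.
+ Changing the Parquet compression codec in Spark 1.6 to snappy also ran fine, with no stall.

### Open question
+ What is the underlying mechanism by which the compression codec has this effect?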
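The codec switch described above is controlled by the Spark SQL setting `spark.sql.parquet.compression.codec`. The following is a minimal sketch, assuming the job is launched through `spark-submit`; the post does not say how Mario passes its configuration, and the helper name `parquet_codec_conf` is hypothetical.

```python
# Subset of the values accepted by spark.sql.parquet.compression.codec
# that are relevant to this question.
SUPPORTED_CODECS = {"gzip", "snappy", "uncompressed"}


def parquet_codec_conf(codec):
    """Return spark-submit arguments that pin the Parquet codec (hypothetical helper)."""
    codec = codec.lower()
    if codec not in SUPPORTED_CODECS:
        raise ValueError(f"unsupported Parquet codec: {codec}")
    return ["--conf", f"spark.sql.parquet.compression.codec={codec}"]


# Print the flags that would be appended to the spark-submit command line.
print(" ".join(parquet_codec_conf("gzip")))
```

Inside a running Spark 2.x session the same key can be changed with `spark.conf.set("spark.sql.parquet.compression.codec", "gzip")`, which makes it easy to compare both codecs on the same table.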
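Problem loading data into a Hive partitioned table
I loaded data into a partitioned table with load data local inpath '/opt/datas/distdata/emp.txt' into table emp_partition partition(month='201512'); I had also changed the MySQL character set: alter database hive character set latin1; The errors are as follows: ![Image description](https://img-ask.csdn.net/upload/201801/07/1515328517_205141.png) ![Image description](https://img-ask.csdn.net/upload/201801/07/1515328530_91016.png) I don't know what is causing this. The file was uploaded, but a select query returns nothing.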
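Splitting the data of a Hive table into separate files with Python
Python 3.6 with the pandas library. I have a Hive table with 30 million rows and 10 columns. I want to group the 30 million rows by the first two columns and write each group's data to its own CSV file (roughly 100,000 groups). Looking for code.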
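One possible approach, sketched with only the standard library: stream the data in batches and append each batch's rows to one file per group, so that neither 30 million rows nor 100,000 open file handles are held at once. This assumes the table has first been exported to a CSV file with a header row, that the output directory starts empty, and that the values of the first two columns are safe to use in file names; none of this is stated in the question, and `split_by_first_two` is a hypothetical name. With pandas the same idea maps to `pandas.read_csv(..., chunksize=...)` followed by `groupby` on the first two columns and `to_csv(..., mode="a")`.

```python
import csv
import os
from collections import defaultdict


def split_by_first_two(src_path, out_dir, batch_rows=100_000):
    """Write rows of src_path to out_dir/<col0>_<col1>.csv; return the group count."""
    os.makedirs(out_dir, exist_ok=True)
    seen = set()  # groups whose file already has a header row

    def flush(buckets, header):
        # Append the buffered rows of every group to that group's file.
        for (a, b), rows in buckets.items():
            path = os.path.join(out_dir, f"{a}_{b}.csv")
            with open(path, "a", newline="", encoding="utf-8") as out:
                writer = csv.writer(out)
                if (a, b) not in seen:
                    writer.writerow(header)
                    seen.add((a, b))
                writer.writerows(rows)

    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        buckets, pending = defaultdict(list), 0
        for row in reader:
            buckets[(row[0], row[1])].append(row)
            pending += 1
            if pending >= batch_rows:
                # Bound memory use: write out the batch and start a new one.
                flush(buckets, header)
                buckets, pending = defaultdict(list), 0
        flush(buckets, header)  # remaining rows of the last partial batch
    return len(seen)
```

A larger `batch_rows` means fewer file opens per group at the cost of more memory; the value here is only a starting point.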
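Help with a Hive data warehouse interview question
In Hive, how should it be handled when several people operate on the same table at the same time? I'd appreciate a detailed explanation. I can't work it out and searching turned up nothing. Does anyone here know?
How to analyze data with big data algorithms when there is no fixed analysis goal
1. There is a large amount of data, and only the data, with no other information. Without knowing what result is expected, I am supposed to analyze it and run clustering, regression and similar computations, and whatever comes out is the result. The data has column headers, but they are highly specialized to the electric power domain. The requirement (a restriction set by the client) is to analyze using the data alone. I can't even work out how many clusters the most basic K-means algorithm should use, and I can't tell whether there is a linear-regression relationship; the client says there is, but I can't see it.
2. I have not studied big data or algorithms. So far I have only set up Hadoop, Spark, Hive and the like. Writing MapReduce in Java requires an algorithms background, and PySpark likewise requires understanding the algorithms first and then the API.
3. I hope someone interested can give it a try. This is paid work; message me privately for details.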
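Randomly sampling data in Hive while guaranteeing randomness
Randomly sample 1,000 rows in Hive, making sure the sample is random and that two samplings do not return the same data.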
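A common way to do this is to order by Hive's built-in `rand()` function and take the first rows. The sketch below only builds the HiveQL string; the helper name `random_sample_sql` and the table name are hypothetical, and the table name is interpolated without sanitizing, so it should come from trusted input.

```python
def random_sample_sql(table, n=1000):
    """Build a HiveQL query returning n rows in random order (hypothetical helper)."""
    if not isinstance(n, int) or n <= 0:
        raise ValueError("n must be a positive integer")
    # rand() without a seed produces a different ordering on every run,
    # so two executions are very unlikely to return the same rows.
    return f"SELECT * FROM {table} ORDER BY rand() LIMIT {n}"


print(random_sample_sql("my_db.my_table"))
```

Note that `ORDER BY` in Hive performs a global sort, which can be slow on large tables; whether that cost is acceptable depends on the table size, which the question does not give.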
Query runs correctly in Hive, but hive-jdbc reports a syntax error at the alias
The SQL runs correctly in Hive, but when it is executed through hive-jdbc with ResultSet rs = st.executeQuery(sql), a syntax error is reported at the alias. ![Image description](https://img-ask.csdn.net/upload/202001/03/1578040942_99148.png)
SQL: SELECT aa.customerid FROM ( (SELECT customerid FROM oder WHERE saleno = 101870 AND orderstatus NOT IN (1000, 1007, 1008) AND obcustomertype != 1004 AND source = 1001 AND zipcode != 'null') as `aa` LEFT JOIN (SELECT customerid FROM oder WHERE saleno IN ( 101345, 101955, 101000, 101099, 101362 ) AND orderstatus NOT IN (1000, 1007, 1008) AND obcustomertype != 1004 AND source = 1001 AND zipcode != 'null') as `bb` ON aa.customerid = bb.customerid ) WHERE bb.customerid IS NULL