seagle01 2016-08-29 07:38

Spark SQL query against a Hive table returns all nulls

16/08/29 15:32:46 INFO ParseDriver: Parsing command: FROM dim_shop SELECT koubei_id,customer_id,koubei_customer_pid,first_cat_id,second_cat_id,third_cat_id,owner_id,shop_source,transferred_out where dt = '20160130'
16/08/29 15:32:46 INFO ParseDriver: Parse Completed
16/08/29 15:32:47 INFO MemoryStore: ensureFreeSpace(428968) called with curMem=1497706, maxMem=556038881
16/08/29 15:32:47 INFO MemoryStore: Block broadcast_8 stored as values in memory (estimated size 418.9 KB, free 528.4 MB)
16/08/29 15:32:47 INFO MemoryStore: ensureFreeSpace(46219) called with curMem=1926674, maxMem=556038881
16/08/29 15:32:47 INFO MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 45.1 KB, free 528.4 MB)
16/08/29 15:32:47 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory on 10.100.24.30:57113 (size: 45.1 KB, free: 530.1 MB)
16/08/29 15:32:47 INFO SparkContext: Created broadcast 8 from show at TmpAliTradSchema.scala:53
16/08/29 15:32:47 INFO FileInputFormat: Total input paths to process : 1
16/08/29 15:32:47 INFO NetworkTopology: Adding a new node: /default/10.100.24.30:50010
16/08/29 15:32:47 INFO NetworkTopology: Adding a new node: /default/10.100.24.10:50010
16/08/29 15:32:47 INFO NetworkTopology: Adding a new node: /default/10.100.24.29:50010
16/08/29 15:32:47 INFO SparkContext: Starting job: show at TmpAliTradSchema.scala:53
16/08/29 15:32:47 INFO DAGScheduler: Got job 5 (show at TmpAliTradSchema.scala:53) with 1 output partitions
16/08/29 15:32:47 INFO DAGScheduler: Final stage: ResultStage 5(show at TmpAliTradSchema.scala:53)
16/08/29 15:32:47 INFO DAGScheduler: Parents of final stage: List()
16/08/29 15:32:47 INFO DAGScheduler: Missing parents: List()
16/08/29 15:32:47 INFO DAGScheduler: Submitting ResultStage 5 (MapPartitionsRDD[25] at show at TmpAliTradSchema.scala:53), which has no missing parents
16/08/29 15:32:47 INFO MemoryStore: ensureFreeSpace(14504) called with curMem=1972893, maxMem=556038881
16/08/29 15:32:47 INFO MemoryStore: Block broadcast_9 stored as values in memory (estimated size 14.2 KB, free 528.4 MB)
16/08/29 15:32:47 INFO MemoryStore: ensureFreeSpace(5576) called with curMem=1987397, maxMem=556038881
16/08/29 15:32:47 INFO MemoryStore: Block broadcast_9_piece0 stored as bytes in memory (estimated size 5.4 KB, free 528.4 MB)
16/08/29 15:32:47 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory on 10.100.24.30:57113 (size: 5.4 KB, free: 530.1 MB)
16/08/29 15:32:47 INFO SparkContext: Created broadcast 9 from broadcast at DAGScheduler.scala:861
16/08/29 15:32:47 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 5 (MapPartitionsRDD[25] at show at TmpAliTradSchema.scala:53)
16/08/29 15:32:47 INFO YarnScheduler: Adding task set 5.0 with 1 tasks
16/08/29 15:32:47 INFO TaskSetManager: Starting task 0.0 in stage 5.0 (TID 5, datanode162.hadoop, partition 0,NODE_LOCAL, 2443 bytes)
16/08/29 15:32:47 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory on datanode162.hadoop:38271 (size: 5.4 KB, free: 530.1 MB)
16/08/29 15:32:47 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory on datanode162.hadoop:38271 (size: 45.1 KB, free: 530.1 MB)
16/08/29 15:32:47 INFO DAGScheduler: ResultStage 5 (show at TmpAliTradSchema.scala:53) finished in 0.199 s
16/08/29 15:32:47 INFO TaskSetManager: Finished task 0.0 in stage 5.0 (TID 5) in 202 ms on datanode162.hadoop (1/1)
16/08/29 15:32:47 INFO DAGScheduler: Job 5 finished: show at TmpAliTradSchema.scala:53, took 0.251634 s
16/08/29 15:32:47 INFO YarnScheduler: Removed TaskSet 5.0, whose tasks have all completed, from pool
+---------+-----------+-------------------+------------+-------------+------------+--------+-----------+---------------+
|koubei_id|customer_id|koubei_customer_pid|first_cat_id|second_cat_id|third_cat_id|owner_id|shop_source|transferred_out|
+---------+-----------+-------------------+------------+-------------+------------+--------+-----------+---------------+
|null |null |null |null |null |null |null |null |null |
|null     |null       |null               |null        |null         |null        |null    |null       |null           |
(18 further rows, all entirely null, omitted)
+---------+-----------+-------------------+------------+-------------+------------+--------+-----------+---------------+
only showing top 20 rows

The installed Spark is the 1.5.2 release bundled with CDH 5.5. The problem only appears when the Hive table is partitioned AND its field delimiter is set to something other than the default \001.
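For reference, a minimal sketch of the kind of setup described (table and app names hypothetical, not the poster's actual DDL) that matches the conditions above on Spark 1.5.x:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical reproduction sketch, assuming a Hive-enabled Spark 1.5.x build.
val sc = new SparkContext(new SparkConf().setAppName("NullColumnsRepro"))
val hiveContext = new HiveContext(sc)

// A partitioned text table whose field delimiter is NOT the default \001 --
// the two conditions under which the poster sees all-null results.
hiveContext.sql(
  """CREATE TABLE IF NOT EXISTS dim_shop_demo (
    |  koubei_id STRING,
    |  customer_id STRING
    |) PARTITIONED BY (dt STRING)
    |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    |STORED AS TEXTFILE""".stripMargin)

// Querying the partition back through Spark SQL is where the
// all-null columns were observed.
hiveContext
  .sql("SELECT koubei_id, customer_id FROM dim_shop_demo WHERE dt = '20160130'")
  .show()
```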


5 answers

  • Mioopoi 2019-08-30 17:03

    Try the following setting:

    set spark.sql.hive.convertMetastoreParquet=false
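This property makes Spark SQL read metastore Parquet tables through the Hive SerDe instead of its built-in Parquet reader. Since the poster's table appears to be a delimited text table rather than Parquet, it is not certain this is the root cause, but for completeness, a sketch of the usual ways to apply the setting on Spark 1.5.x (the `hiveContext` value is assumed to exist, as in the question's code):

```scala
// Option 1: set it on the HiveContext before running the query.
hiveContext.setConf("spark.sql.hive.convertMetastoreParquet", "false")

// Option 2: set it per-session through SQL, as in the answer above.
hiveContext.sql("SET spark.sql.hive.convertMetastoreParquet=false")
```

It can also be passed at submit time with `--conf spark.sql.hive.convertMetastoreParquet=false` on `spark-submit`.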
    
