16/08/29 15:32:46 INFO ParseDriver: Parsing command: FROM dim_shop SELECT koubei_id,customer_id,koubei_customer_pid,first_cat_id,second_cat_id,third_cat_id,owner_id,shop_source,transferred_out where dt = '20160130'
16/08/29 15:32:46 INFO ParseDriver: Parse Completed
16/08/29 15:32:47 INFO MemoryStore: ensureFreeSpace(428968) called with curMem=1497706, maxMem=556038881
16/08/29 15:32:47 INFO MemoryStore: Block broadcast_8 stored as values in memory (estimated size 418.9 KB, free 528.4 MB)
16/08/29 15:32:47 INFO MemoryStore: ensureFreeSpace(46219) called with curMem=1926674, maxMem=556038881
16/08/29 15:32:47 INFO MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 45.1 KB, free 528.4 MB)
16/08/29 15:32:47 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory on 10.100.24.30:57113 (size: 45.1 KB, free: 530.1 MB)
16/08/29 15:32:47 INFO SparkContext: Created broadcast 8 from show at TmpAliTradSchema.scala:53
16/08/29 15:32:47 INFO FileInputFormat: Total input paths to process : 1
16/08/29 15:32:47 INFO NetworkTopology: Adding a new node: /default/10.100.24.30:50010
16/08/29 15:32:47 INFO NetworkTopology: Adding a new node: /default/10.100.24.10:50010
16/08/29 15:32:47 INFO NetworkTopology: Adding a new node: /default/10.100.24.29:50010
16/08/29 15:32:47 INFO SparkContext: Starting job: show at TmpAliTradSchema.scala:53
16/08/29 15:32:47 INFO DAGScheduler: Got job 5 (show at TmpAliTradSchema.scala:53) with 1 output partitions
16/08/29 15:32:47 INFO DAGScheduler: Final stage: ResultStage 5(show at TmpAliTradSchema.scala:53)
16/08/29 15:32:47 INFO DAGScheduler: Parents of final stage: List()
16/08/29 15:32:47 INFO DAGScheduler: Missing parents: List()
16/08/29 15:32:47 INFO DAGScheduler: Submitting ResultStage 5 (MapPartitionsRDD[25] at show at TmpAliTradSchema.scala:53), which has no missing parents
16/08/29 15:32:47 INFO MemoryStore: ensureFreeSpace(14504) called with curMem=1972893, maxMem=556038881
16/08/29 15:32:47 INFO MemoryStore: Block broadcast_9 stored as values in memory (estimated size 14.2 KB, free 528.4 MB)
16/08/29 15:32:47 INFO MemoryStore: ensureFreeSpace(5576) called with curMem=1987397, maxMem=556038881
16/08/29 15:32:47 INFO MemoryStore: Block broadcast_9_piece0 stored as bytes in memory (estimated size 5.4 KB, free 528.4 MB)
16/08/29 15:32:47 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory on 10.100.24.30:57113 (size: 5.4 KB, free: 530.1 MB)
16/08/29 15:32:47 INFO SparkContext: Created broadcast 9 from broadcast at DAGScheduler.scala:861
16/08/29 15:32:47 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 5 (MapPartitionsRDD[25] at show at TmpAliTradSchema.scala:53)
16/08/29 15:32:47 INFO YarnScheduler: Adding task set 5.0 with 1 tasks
16/08/29 15:32:47 INFO TaskSetManager: Starting task 0.0 in stage 5.0 (TID 5, datanode162.hadoop, partition 0,NODE_LOCAL, 2443 bytes)
16/08/29 15:32:47 INFO BlockManagerInfo: Added broadcast_9_piece0 in memory on datanode162.hadoop:38271 (size: 5.4 KB, free: 530.1 MB)
16/08/29 15:32:47 INFO BlockManagerInfo: Added broadcast_8_piece0 in memory on datanode162.hadoop:38271 (size: 45.1 KB, free: 530.1 MB)
16/08/29 15:32:47 INFO DAGScheduler: ResultStage 5 (show at TmpAliTradSchema.scala:53) finished in 0.199 s
16/08/29 15:32:47 INFO TaskSetManager: Finished task 0.0 in stage 5.0 (TID 5) in 202 ms on datanode162.hadoop (1/1)
16/08/29 15:32:47 INFO DAGScheduler: Job 5 finished: show at TmpAliTradSchema.scala:53, took 0.251634 s
16/08/29 15:32:47 INFO YarnScheduler: Removed TaskSet 5.0, whose tasks have all completed, from pool
+---------+-----------+-------------------+------------+-------------+------------+--------+-----------+---------------+
|koubei_id|customer_id|koubei_customer_pid|first_cat_id|second_cat_id|third_cat_id|owner_id|shop_source|transferred_out|
+---------+-----------+-------------------+------------+-------------+------------+--------+-----------+---------------+
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
|null |null |null |null |null |null |null |null |null |
+---------+-----------+-------------------+------------+-------------+------------+--------+-----------+---------------+
only showing top 20 rows
The installed Spark is the 1.5.2 release bundled with CDH 5.5. The problem only appears when the table is partitioned in Hive and the specified field delimiter is something other than the default \001.
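For reference, a minimal sketch of a Hive table definition that matches the two conditions described above (partitioned, with a non-default field delimiter). The table name `dim_shop_repro` and the reduced column list are hypothetical stand-ins; the actual `dim_shop` DDL is not shown in this post:

```sql
-- Hypothetical DDL matching the reported conditions: a partitioned Hive
-- table whose field delimiter is NOT the default \001.
CREATE TABLE dim_shop_repro (
  koubei_id   STRING,
  customer_id STRING
)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'   -- non-default delimiter; per the report, this triggers the all-null rows
STORED AS TEXTFILE;

-- Per the report, the same query returns data correctly when the table is
-- unpartitioned or when the default delimiter is kept:
--   FIELDS TERMINATED BY '\001'
```

If the custom `field.delim` is not applied when Spark 1.5.x reads the partition files, each line fails to split into columns and every field comes back null, which would match the `show()` output above; this is an inference from the symptom, not a confirmed root cause.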