【Detailed Problem Description】
Spark (with its bundled Hive) cannot read data from the main table of a main/sub-table setup, while non-main tables can be read normally. Spark version: spark-1.3.0-bin-hadoop2.4
Jar packages used:
spark-sequoiadb-1.12.jar
sequoiadb-driver-1.12.jar
hadoop-sequoiadb-1.12.jar
hive-sequoiadb-1.12.jar
postgresql-9.4-1201-jdbc41.jar
Error when querying the main table:
select * from test201607_cs.tb_order limit 1 ;
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 16.0 failed 4 times, most recent failure: Lost task 0.3 in stage 16.0 (TID 362, sdb-223.3golden.hq): com.sequoiadb.exception.BaseException: errorType:SDB_DMS_CS_NOTEXIST,Collection space does not exist
Exception Detail:test201607_cs
at com.sequoiadb.base.Sequoiadb.getCollectionSpace(Sequoiadb.java:598)
at com.sequoiadb.hive.SdbReader.&lt;init&gt;(SdbReader.java:145)
at com.sequoiadb.hive.SdbHiveInputFormat.getRecordReader(SdbHiveInputFormat.java:120)
at org.apache.spark.rdd.HadoopRDD$$anon$1.&lt;init&gt;(HadoopRDD.scala:236)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:212)
Result of querying a non-main table:
select * from test201607_cs.test_hive limit 1 ;
+----------+
| shop_id |
+----------+
| 10048 |
+----------+