冬哥不是东哥 2019-04-28 12:04 Acceptance rate: 0%

Writing a DataFrame to HBase with the shc framework garbles int columns

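When I use the shc framework from Python to write a DataFrame to HBase, the int columns come out garbled, and they are just as garbled when I read the table back into a DataFrame.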

# Load the source table over JDBC and register it as a temp view.
df_test_HBase = sql_sc.read.format('jdbc').options(url=jdbc_url_test, driver=jdbc_driver, dbtable='testHBase').load()
df_test_HBase.createOrReplaceTempView("test_HBase")
# Workaround: cast the int columns (id, age, gender) to String before writing.
df_cast_HBase = sql_sc.sql("select CAST(id as String) id,name,CAST(age as String) age,CAST(gender as String) gender,cat,tag,level from test_HBase")
df_cast_HBase.show()
# Data source class provided by shc.
dep = "org.apache.spark.sql.execution.datasources.hbase"

catalog = """{
              "table":{"namespace":"default", "name":"teacher", "tableCoder":"PrimitiveType"},
              "rowkey":"key",
              "columns":{
                       "id":{"cf":"rowkey", "col":"key", "type":"string"},
                       "name":{"cf":"teacherBase", "col":"name", "type":"string"},
                       "age":{"cf":"teacherBase", "col":"age", "type":"string"},
                       "gender":{"cf":"teacherBase", "col":"gender","type":"string"},
                       "cat":{"cf":"teacherDetails", "col":"cat","type":"string"},
                       "tag":{"cf":"teacherDetails", "col":"tag", "type":"string"},
                       "level":{"cf":"teacherDetails", "col":"level","type":"string"}  }
            } """
df_cast_HBase.write.options(catalog=catalog,newTable="5").format(dep).save()

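So far the only way I can avoid the garbled output is to CAST the int columns to String and also change their catalog type from int to string before writing to HBase. That only treats the symptom, not the cause. Can anyone suggest a proper fix?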
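One direction that might address the root cause, offered as a sketch rather than a confirmed fix: leave the numeric columns as int in both the DataFrame and the catalog, and read the table back through shc with that same catalog, so the same coder that wrote the bytes also decodes them. The helper `build_catalog` below is hypothetical and only assembles the catalog JSON from the post. It assumes `id`, `age`, and `gender` are the int columns (they are the ones being CAST in the query above) and that shc's PrimitiveType coder accepts `"int"` as a catalog type.

```python
import json

# Column layout copied from the catalog in the post: name -> (column family, qualifier).
COLUMNS = {
    "id": ("rowkey", "key"),
    "name": ("teacherBase", "name"),
    "age": ("teacherBase", "age"),
    "gender": ("teacherBase", "gender"),
    "cat": ("teacherDetails", "cat"),
    "tag": ("teacherDetails", "tag"),
    "level": ("teacherDetails", "level"),
}


def build_catalog(int_columns):
    """Return an shc catalog as a JSON string.

    Columns named in `int_columns` are typed "int"; all others are typed "string".
    """
    columns = {
        name: {"cf": cf, "col": col, "type": "int" if name in int_columns else "string"}
        for name, (cf, col) in COLUMNS.items()
    }
    return json.dumps({
        "table": {"namespace": "default", "name": "teacher", "tableCoder": "PrimitiveType"},
        "rowkey": "key",
        "columns": columns,
    })


# Assumption: id, age and gender are int columns in the source table.
catalog = build_catalog({"id", "age", "gender"})

# Hypothetical usage with the names from the post (needs Spark and shc, so it is not run here):
# df_test_HBase.write.options(catalog=catalog, newTable="5").format(dep).save()
# df_back = sql_sc.read.options(catalog=catalog).format(dep).load()
```

With this approach the values would still look unreadable in the HBase shell, because they are stored as binary rather than text, but a DataFrame loaded through the same catalog should show them as ordinary integers.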
Before/after comparison: [image]
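A possible explanation for the "garbled" values, stated as an assumption and not a verified diagnosis: if shc's PrimitiveType coder stores an int the way HBase's `Bytes.toBytes(int)` does, each value is written as 4 big-endian bytes instead of as digit characters. Anything that treats those bytes as text, such as the HBase shell or a catalog that declares the column as string, would then show control characters. The sketch below uses only Python's `struct` module to imitate that layout; `encode_int` and `decode_int` are illustrative names, not part of shc or HBase.

```python
import struct


def encode_int(value):
    """Pack a 32-bit signed int as 4 big-endian bytes (the layout assumed for an int cell)."""
    return struct.pack(">i", value)


def decode_int(raw):
    """Unpack 4 big-endian bytes back into a Python int."""
    return struct.unpack(">i", raw)[0]


age = 25
as_binary = encode_int(age)          # what an int-typed column is assumed to hold
as_text = str(age).encode("utf-8")   # what a string-typed column holds after CAST

print(as_binary)              # b'\x00\x00\x00\x19'
print(as_text)                # b'25'
print(decode_int(as_binary))  # 25

# Reading the binary cell as if it were text yields unprintable control characters.
misread = as_binary.decode("utf-8")
print(repr(misread))          # '\x00\x00\x00\x19'
```

If this is what is happening, the data itself is intact and the mismatch is between how the cell was written and how it is being displayed or decoded.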
