冬哥不是东哥 2019-04-28 12:04 Acceptance rate: 0%

Writing a DataFrame to HBase with the shc framework garbles int columns

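When I use the shc framework from Python to write a DataFrame to HBase, the int columns come out garbled, and reading them back into a DataFrame gives the same garbled values.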

df_test_HBase = sql_sc.read.format('jdbc').options(url=jdbc_url_test,driver=jdbc_driver,dbtable='testHBase').load()
df_test_HBase.createOrReplaceTempView("test_HBase")
df_cast_HBase = sql_sc.sql("select CAST(id as String) id,name,CAST(age as String) age,CAST(gender as String) gender,cat,tag,level from test_HBase")
df_cast_HBase.show()
dep = "org.apache.spark.sql.execution.datasources.hbase"

catalog = """{
              "table":{"namespace":"default", "name":"teacher", "tableCoder":"PrimitiveType"},
              "rowkey":"key",
              "columns":{
                       "id":{"cf":"rowkey", "col":"key", "type":"string"},
                       "name":{"cf":"teacherBase", "col":"name", "type":"string"},
                       "age":{"cf":"teacherBase", "col":"age", "type":"string"},
                       "gender":{"cf":"teacherBase", "col":"gender","type":"string"},
                       "cat":{"cf":"teacherDetails", "col":"cat","type":"string"},
                       "tag":{"cf":"teacherDetails", "col":"tag", "type":"string"},
                       "level":{"cf":"teacherDetails", "col":"level","type":"string"}  }
            } """
df_cast_HBase.write.options(catalog=catalog,newTable="5").format(dep).save()

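So far the only way I can avoid the garbled values is to CAST the int columns to String and also change their type in the catalog from int to string before writing to HBase. That only treats the symptom, not the cause. Can anyone suggest a proper fix?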
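For comparison, here is a minimal sketch of a catalog that keeps the numeric columns typed as int instead of string. It assumes that age and gender are the int columns (the CAST calls above suggest this) and that shc accepts "int" as a column type. `build_catalog` is a hypothetical helper, not part of shc; it only assembles the JSON string.

```python
import json

def build_catalog(namespace, table, rowkey, columns):
    """Assemble an shc-style catalog JSON string.

    columns maps a DataFrame column name to (column family, qualifier, type).
    Hypothetical helper for illustration only.
    """
    return json.dumps({
        "table": {"namespace": namespace, "name": table, "tableCoder": "PrimitiveType"},
        "rowkey": rowkey,
        "columns": {
            name: {"cf": cf, "col": col, "type": typ}
            for name, (cf, col, typ) in columns.items()
        },
    })

# Assumption: age and gender are int in the source table; id stays a string rowkey.
catalog_int = build_catalog("default", "teacher", "key", {
    "id":     ("rowkey", "key", "string"),
    "name":   ("teacherBase", "name", "string"),
    "age":    ("teacherBase", "age", "int"),
    "gender": ("teacherBase", "gender", "int"),
    "cat":    ("teacherDetails", "cat", "string"),
    "tag":    ("teacherDetails", "tag", "string"),
    "level":  ("teacherDetails", "level", "string"),
})
```

If the same catalog string is passed to both the write and the later read, the declared types at least match on both sides. Whether that alone resolves the issue here is not confirmed by the post.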
Before/after comparison:
[image]
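One possible explanation for the "garbled" values, offered as an assumption rather than a confirmed diagnosis: if an int-typed column is stored as 4 big-endian bytes (the layout that HBase's `Bytes.toBytes(int)` produces), any tool that displays the raw cell as text will show escape sequences instead of digits. The sketch below uses only Python's `struct` module to show the difference between the binary and the string encoding of the same value. The function names are illustrative.

```python
import struct

def encode_int_be(value: int) -> bytes:
    """Pack a signed 32-bit int as 4 big-endian bytes."""
    return struct.pack(">i", value)

def decode_int_be(raw: bytes) -> int:
    """Unpack 4 big-endian bytes back into a signed 32-bit int."""
    return struct.unpack(">i", raw)[0]

age = 25
as_binary = encode_int_be(age)       # bytes an int-typed cell would hold under this assumption
as_text = str(age).encode("utf-8")   # bytes a string-typed cell holds

print(as_binary)                 # b'\x00\x00\x00\x19'
print(as_text)                   # b'25'
print(decode_int_be(as_binary))  # 25
```

Under this assumption the stored value is intact and only its textual display looks wrong. A reader that declares the column as string would likewise see the 4 raw bytes rather than "25".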
