科大人 2019-05-06 20:32 · acceptance rate: 0%
2,129 views

Hive is driving me crazy: whenever I process a Hive external table mapped from HBase, the job crashes because of the huge volume of HBase data.

Hadoop cluster: 3 nodes, CentOS 7.5, 4 cores and 16 GB RAM each.


The HBase table holds roughly 70 million rows.


The Hive external table is mapped onto the HBase table:

create external table worked_data_o (
  key String, province String, city String, code String, acc_number String,
  tel String, wd_date String, rep_disorder String, overtime String,
  receipt String, descr String, exp_empid String, fault_1 String,
  fault_2 String, fault_type String, acs_way String, address String)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" =
  ":key,data:province,data:city,data:code,data:account,data:tel,data:date,data:rep_disorder,data:overtime,data:receipt,data:emp_descr,data:emp_id,data:fault_name,data:fault_descr,data:fault_type,data:acs_way,data:address")
TBLPROPERTIES ("hbase.table.name" = "worked_data");
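Before launching a full-table MapReduce job, it can help to smoke-test the mapping with a small probe. This is only a sketch: with fetch-task conversion enabled, a `LIMIT` query like this runs client-side and returns after a handful of rows instead of scanning all 70 million.

```sql
-- Verify the HBase column mapping with a cheap probe query.
-- hive.fetch.task.conversion=more lets simple SELECT/LIMIT queries
-- bypass MapReduce entirely.
SET hive.fetch.task.conversion=more;

SELECT key, acc_number, descr
FROM worked_data_o
LIMIT 10;
```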

The external table is then processed into a new table:

create table customer_intention as
select wd.acc_number as acc_number,
       if(wd.descr = '客户不满' or wd.descr = '客户不配合', 2,
          if(wd.descr = '客户不听解释' or wd.descr = '客户情绪激动', 4,
             if(wd.descr = '客户有投诉意向', 6,
                if(wd.descr = '客户有强烈投诉意向', 8, 1)))) as mood
from worked_data_o as wd;
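As an aside, the nested IF chain is easier to read, and behaves identically, when written as a CASE expression. This is purely a stylistic sketch of the same query:

```sql
-- Equivalent CTAS using CASE instead of nested IFs.
CREATE TABLE customer_intention AS
SELECT wd.acc_number,
       CASE
         WHEN wd.descr IN ('客户不满', '客户不配合')       THEN 2
         WHEN wd.descr IN ('客户不听解释', '客户情绪激动') THEN 4
         WHEN wd.descr = '客户有投诉意向'                  THEN 6
         WHEN wd.descr = '客户有强烈投诉意向'              THEN 8
         ELSE 1
       END AS mood
FROM worked_data_o wd;
```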

The error:

WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = zkpk_20190506095541_4af46dd8-bf98-4a0e-b0b4-d7ed2974ab7b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1556796305355_0016, Tracking URL = http://master:8088/proxy/application_1556796305355_0016/
Kill Command = /home/zkpk/hadoop-2.7.2/bin/hadoop job  -kill job_1556796305355_0016
Hadoop job information for Stage-1: number of mappers: 8; number of reducers: 1
2019-05-06 09:55:54,098 Stage-1 map = 0%,  reduce = 0%
2019-05-06 09:56:54,965 Stage-1 map = 0%,  reduce = 0%, Cumulative CPU 149.84 sec
2019-05-06 09:57:55,903 Stage-1 map = 0%,  reduce = 0%, Cumulative CPU 212.38 sec
2019-05-06 09:58:56,456 Stage-1 map = 0%,  reduce = 0%, Cumulative CPU 266.3 sec
2019-05-06 09:59:57,079 Stage-1 map = 0%,  reduce = 0%, Cumulative CPU 266.3 sec
2019-05-06 10:00:57,760 Stage-1 map = 0%,  reduce = 0%, Cumulative CPU 266.3 sec
[... map and reduce remain at 0% for the next ~40 minutes; Cumulative CPU stops growing at 266.3 sec ...]
2019-05-06 10:40:23,058 Stage-1 map = 100%,  reduce = 100%
MapReduce Total cumulative CPU time: 4 minutes 26 seconds 300 msec
Ended Job = job_1556796305355_0016 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1556796305355_0016_m_000006 (and more) from job job_1556796305355_0016
Examining task ID: task_1556796305355_0016_m_000002 (and more) from job job_1556796305355_0016
Examining task ID: task_1556796305355_0016_m_000000 (and more) from job job_1556796305355_0016

Task with the most failures(4): 
-----
Task ID:
  task_1556796305355_0016_m_000004

URL:
  http://master:8088/taskdetails.jsp?jobid=job_1556796305355_0016&tipid=task_1556796305355_0016_m_000004
-----
Diagnostic Messages for this Task:
AttemptID:attempt_1556796305355_0016_m_000004_3 Timed out after 600 secs

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 8  Reduce: 1   Cumulative CPU: 266.3 sec   HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 4 minutes 26 seconds 300 msec
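For what it's worth, the diagnostic line "Timed out after 600 secs" matches Hadoop's default task timeout (`mapreduce.task.timeout` = 600000 ms): the mappers scan HBase without reporting progress for 10 minutes, so YARN kills the attempts. A sketch of the usual session-level mitigations follows; the values are illustrative, not tuned for this cluster:

```sql
-- Give slow HBase scans more headroom before the attempt is killed (30 min).
SET mapreduce.task.timeout=1800000;
-- Fetch more rows per scanner RPC so each mapper makes visible progress.
SET hbase.client.scanner.caching=500;
-- Allow individual scanner RPCs more time on the region server side.
SET hbase.client.scanner.timeout.period=600000;
```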

Any ideas on how to fix this would be appreciated.


2 answers · sorted by newest

  • qq_36466210 2020-03-11 21:39

    The MapReduce memory settings are misconfigured.
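    If memory really is the culprit, the per-task allocation can be raised at the session level. The numbers below are only illustrative starting points for 16 GB nodes, not tuned values; the JVM heap (`-Xmx`) is conventionally set to about 80% of the container size:

    ```sql
    -- Illustrative per-task memory settings for 16 GB nodes (adjust to taste).
    SET mapreduce.map.memory.mb=4096;
    SET mapreduce.map.java.opts=-Xmx3276m;
    SET mapreduce.reduce.memory.mb=4096;
    SET mapreduce.reduce.java.opts=-Xmx3276m;
    ```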

