海边微风落 2021-08-17 11:37 采纳率: 100%
浏览 566
已结题

hive使用Tez引擎失败

相关环境描述:
1.java版本:jdk1.8_212
2.hadoop版本:3.1.3
3.hive版本:3.1.2
4.tez版本:0.9.2

问题描述:
在hive中使用tez引擎报错,就连更换回mr也使用不了了,具体原因如下:

0: jdbc:hive2://node01:10000> select count(1) from emp;
INFO  : Compiling command(queryId=root_20210817113418_58b6d06e-46d9-47d6-a893-fca739db7ea0): select count(1) from emp
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=root_20210817113418_58b6d06e-46d9-47d6-a893-fca739db7ea0); Time taken: 0.146 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing command(queryId=root_20210817113418_58b6d06e-46d9-47d6-a893-fca739db7ea0): select count(1) from emp
INFO  : Query ID = root_20210817113418_58b6d06e-46d9-47d6-a893-fca739db7ea0
INFO  : Total jobs = 1
INFO  : Launching Job 1 out of 1
INFO  : Starting task [Stage-1:MAPRED] in serial mode
WARN  : The session: sessionId=c7f2b41a-0426-40ae-993b-b3f17640520b, queueName=null, user=root, doAs=true, isOpen=false, isDefault=false has not been opened
INFO  : Subscribed to counters: [] for queryId: root_20210817113418_58b6d06e-46d9-47d6-a893-fca739db7ea0
INFO  : Tez session hasn't been created yet. Opening session
ERROR : Failed to execute tez graph.
org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1629168275110_0004 failed 2 times due to AM Container for appattempt_1629168275110_0004_000002 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2021-08-17 11:34:23.036]Exception from container-launch.
Container id: container_1629168275110_0004_02_000001
Exit code: 1

[2021-08-17 11:34:23.038]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :


[2021-08-17 11:34:23.039]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :


For more detailed output, check the application tracking page: http://node02:8088/cluster/app/application_1629168275110_0004 Then click on links to logs of each attempt.
. Failing the application.
    at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:1013) ~[tez-api-0.9.2.jar:0.9.2]
    at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:982) ~[tez-api-0.9.2.jar:0.9.2]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezSessionState.java:453) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:368) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(TezSessionPoolSession.java:124) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:245) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezTask.java:368) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:195) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) ~[hive-exec-3.1.2.jar:3.1.2]
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224) ~[hive-service-3.1.2.jar:3.1.2]
    at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) ~[hive-service-3.1.2.jar:3.1.2]
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:316) ~[hive-service-3.1.2.jar:3.1.2]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_212]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_212]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) ~[hadoop-common-3.1.3.jar:?]
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:329) ~[hive-service-3.1.2.jar:3.1.2]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_212]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_212]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_212]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_212]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
INFO  : Completed executing command(queryId=root_20210817113418_58b6d06e-46d9-47d6-a893-fca739db7ea0); Time taken: 4.709 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask (state=08S01,code=1)


相关配置信息:
1.hadoop core-site

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://node01:8020</value>
    </property>
    <!-- 文件存储目录 -->
    <property>
        <name>hadoop.data.dir</name>
        <value>/opt/servers/hadoop-3.1.3/datas</value>
    </property>
    <!-- 临时文件存储目录 -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/servers/hadoop-3.1.3/datas/tmp</value>
    </property>
    <property>
            <name>hadoop.proxyuser.root.hosts</name>
            <value>*</value>
     </property>
    <property>
            <name>hadoop.proxyuser.root.groups</name>
            <value>*</value>
    </property>
        <property>
            <name>hadoop.http.staticuser.user</name>
            <value>root</value>
        </property>

</configuration>

2.hadoop capacity-schedular

<property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,MyLine</value>
    <description>
      The queues at the this level (root is the root queue).
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>40</value>
    <description>Default queue target capacity.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
    <value>1</value>
    <description>
      Default queue user limit a percentage from 0.0 to 1.0.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>60</value>
    <description>
      The maximum capacity of the default queue. 
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.state</name>
    <value>RUNNING</value>
    <description>
      The state of the default queue. State can be one of RUNNING or STOPPED.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
    <value>*</value>
    <description>
      The ACL of who can submit jobs to the default queue.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
    <value>*</value>
    <description>
      The ACL of who can administer jobs on the default queue.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.acl_application_max_priority</name>
    <value>*</value>
    <description>
      The ACL of who can submit applications with configured priority.
      For e.g, [user={name} group={name} max_priority={priority} default_priority={priority}]
    </description>
  </property>

   <property>
     <name>yarn.scheduler.capacity.root.default.maximum-application-lifetime
     </name>
     <value>-1</value>
     <description>
        Maximum lifetime of an application which is submitted to a queue
        in seconds. Any value less than or equal to zero will be considered as
        disabled.
        This will be a hard time limit for all applications in this
        queue. If positive value is configured then any application submitted
        to this queue will be killed after exceeds the configured lifetime.
        User can also specify lifetime per application basis in
        application submission context. But user lifetime will be
        overridden if it exceeds queue maximum lifetime. It is point-in-time
        configuration.
        Note : Configuring too low value will result in killing application
        sooner. This feature is applicable only for leaf queue.
     </description>
   </property>

   <property>
     <name>yarn.scheduler.capacity.root.default.default-application-lifetime
     </name>
     <value>-1</value>
     <description>
        Default lifetime of an application which is submitted to a queue
        in seconds. Any value less than or equal to zero will be considered as
        disabled.
        If the user has not submitted application with lifetime value then this
        value will be taken. It is point-in-time configuration.
        Note : Default lifetime can't exceed maximum lifetime. This feature is
        applicable only for leaf queue.
     </description>
   </property>

    <property>
    <name>yarn.scheduler.capacity.root.MyLine.capacity</name>
    <value>60</value>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.MyLine.user-limit-factor</name>
    <value>1</value>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.MyLine.maximum-capacity</name>
    <value>80</value>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.MyLine.state</name>
    <value>RUNNING</value>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.MyLine.acl_submit_applications</name>
    <value>*</value>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.MyLine.acl_administer_queue</name>
    <value>*</value>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.MyLine.acl_application_max_priority</name>
    <value>*</value>
  </property>

   <property>
     <name>yarn.scheduler.capacity.root.MyLine.maximum-application-lifetime
     </name>
     <value>-1</value>
   </property>

   <property>
     <name>yarn.scheduler.capacity.root.MyLine.default-application-lifetime
     </name>
     <value>-1</value>
   </property>
    
  <property>
    <name>yarn.scheduler.capacity.node-locality-delay</name>
    <value>40</value>
    <description>
      Number of missed scheduling opportunities after which the CapacityScheduler 
      attempts to schedule rack-local containers.
      When setting this parameter, the size of the cluster should be taken into account.
      We use 40 as the default value, which is approximately the number of nodes in one rack.
      Note, if this value is -1, the locality constraint in the container request
      will be ignored, which disables the delay scheduling.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.rack-locality-additional-delay</name>
    <value>-1</value>
    <description>
      Number of additional missed scheduling opportunities over the node-locality-delay
      ones, after which the CapacityScheduler attempts to schedule off-switch containers,
      instead of rack-local ones.
      Example: with node-locality-delay=40 and rack-locality-delay=20, the scheduler will
      attempt rack-local assignments after 40 missed opportunities, and off-switch assignments
      after 40+20=60 missed opportunities.
      When setting this parameter, the size of the cluster should be taken into account.
      We use -1 as the default value, which disables this feature. In this case, the number
      of missed opportunities for assigning off-switch containers is calculated based on
      the number of containers and unique locations specified in the resource request,
      as well as the size of the cluster.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.queue-mappings</name>
    <value></value>
    <description>
      A list of mappings that will be used to assign jobs to queues
      The syntax for this list is [u|g]:[name]:[queue_name][,next mapping]*
      Typically this list will be used to map users to queues,
      for example, u:%user:%user maps all users to queues with the same name
      as the user.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
    <value>false</value>
    <description>
      If a queue mapping is present, will it override the value specified
      by the user? This can be used by administrators to place jobs in queues
      that are different than the one specified by the user.
      The default is false.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.per-node-heartbeat.maximum-offswitch-assignments</name>
    <value>1</value>
    <description>
      Controls the number of OFF_SWITCH assignments allowed
      during a node's heartbeat. Increasing this value can improve
      scheduling rate for OFF_SWITCH containers. Lower values reduce
      "clumping" of applications on particular nodes. The default is 1.
      Legal values are 1-MAX_INT. This config is refreshable.
    </description>
  </property>


  <property>
    <name>yarn.scheduler.capacity.application.fail-fast</name>
    <value>false</value>
    <description>
      Whether RM should fail during recovery if previous applications'
      queue is no longer valid.
    </description>
  </property>

</configuration>


  1. hadoop hdfs-site
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file://${hadoop.data.dir}/namenode</value>
    </property>
    <!-- datanode数据的存放路径 -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file://${hadoop.data.dir}/datanode</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>file://${hadoop.data.dir}/namesecondry</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>node02:9868</value>
    </property>
    <property>
        <name>dfs.client.datanode-restart.timeout</name>
        <value>30</value>
    </property>

</configuration>


4.hadoop mapred-site


```html
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn-tez</value>
    </property>

    <!-- 历史服务器端地址 -->
    <property>
            <name>mapreduce.jobhistory.address</name>
            <value>node01:10020</value>
    </property>
    <!-- 历史服务器web端地址 -->
    <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>node01:19888</value>
    </property>
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx2048M</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>4096</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx4096M</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*, 
        $HADOOP_COMMON_HOME/lib/*, 
        $HADOOP_HDFS_HOME/*, 
        $HADOOP_HDFS_HOME/lib/*, 
        $HADOOP_MAPRED_HOME/*, 
        $HADOOP_MAPRED_HOME/lib/*, 
        $HADOOP_YARN_HOME/*, 
        $HADOOP_YARN_HOME/lib/*</value>
    </property>
</configuration>

  1. hadoop yarn -site
<configuration>
    <!-- 设置不检查虚拟内存的值,不然内存不够会报错 -->
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property> 
    <!--检查每个任务正使用的物理内存量,如果任务超出分配值,则直接将其杀掉 -->
    <property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>false</value>
    </property> 
    
    <!-- yarn上面运行一个任务,最少需要1.5G内存,虚拟机没有这么大的内存就调小这个值,不然会报错 -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>node02</value>
    </property>
    <property>
            <name>yarn.nodemanager.env-whitelist</name>
            <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
        </property>
    <!-- 开启日志聚集 -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>  
        <name>yarn.log.server.url</name>  
        <value>http:/node01:19888/jobhistory/logs</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>
    <property>
        <name>yarn.scheduler.fair.user-as-default-queue</name>
        <value>MyLine</value>
    </property>


</configuration>

  1. hive-site
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://node01:3306/metastore?useSSL=false</value>
    </property>

    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>

    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>

    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>root</value>
    </property>

    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>

    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>

    <property>
        <name>datanucleus.schema.autoCreateAll</name>
        <value>true</value> 
    </property>

    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://node01:9083</value>
    </property>

    <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
    </property>

    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>node01</value>
    </property>

    <property>
        <name>hive.metastore.event.db.notification.api.auth</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.execution.engine</name>
        <value>tez</value>
    </property>

</configuration>

  1. tez-site
<configuration>
<property>
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/tez/tez.tar.gz</value>
</property>
<property>
     <name>tez.use.cluster.hadoop-libs</name>
     <value>false</value>
</property>
<property>
     <name>tez.history.logging.service.class</name>
     <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>

<property>
    <name>tez.am.resource.memory.mb</name>
    <value>1024</value>
</property>
<property>
    <name>tez.am.resource.cpu.vcores</name>
    <value>1</value>
</property>
<property>
    <name>tez.container.max.java.heap.fraction</name>
    <value>0.4</value>
</property>
<property>
    <name>tez.task.resource.memory.mb</name>
    <value>1024</value>
</property>
<property>
    <name>tez.task.resource.cpu.vcores</name>
    <value>1</value>
</property>
</configuration>


  1. hadoop hadoop-env
export TEZ_CONF_DIR=$HADOOP_HOME/etc/hadoop
export TEZ_JARS=/opt/servers/tez
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*

9.环境变量相关设置

#JAVA_HOME
export JAVA_HOME=/opt/servers/jdk1.8.0_212
export PATH=$PATH:$JAVA_HOME/bin

##HADOOP_HOME
export HADOOP_HOME=/opt/servers/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

#ZOOKEEPER_HOME
export ZOOKEEPER_HOME=/opt/servers/zookeeper-3.5.7
export PATH=$PATH:$ZOOKEEPER_HOME/bin

#HIVE_HOME
export HIVE_HOME=/opt/servers/hive
export PATH=$PATH:$HIVE_HOME/bin
救救孩子吧,要被折磨疯了
  • 写回答

2条回答 默认 最新

  • CSDN专家-微编程 2021-08-17 23:11
    关注

    关闭所有进程,将配置文件里的所有的中文注释全部删除,记得全部删除,然后保存,完成后从新启动,再次尝试你的操作

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(1条)

报告相同问题?

问题事件

  • 系统已结题 8月27日
  • 已采纳回答 8月19日
  • 创建了问题 8月17日

悬赏问题

  • ¥15 IDEA构建失败?怎么搞
  • ¥15 求该题的simpson,牛顿科特斯matlab代码,越快越好
  • ¥30 求解,有偿,可商量价格
  • ¥15 编译arm板子的gcc
  • ¥15 C++代码报错问题,c++20协程
  • ¥15 c++图Djikstra算法求最短路径
  • ¥15 Linux操作系统中的,管道通信问题
  • ¥15 ansible tower 卡住
  • ¥15 等间距平面螺旋天线方程式
  • ¥15 通过链接访问,显示514或不是私密连接