找一群哥们,一起搞事 2015-09-21 02:03

HMaster shuts itself down every day; any pointers would be appreciated

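I have recently run into a frustrating problem: HBase goes down on its own once a day, at some point between roughly 5:30 and 5:45. So far I have tried the following:
1. Checked the hosts configuration.
2. Checked clock synchronization.
3. Set the session timeout to 60 s.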
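
On step 3, it may be worth checking what timeout the ZooKeeper server actually grants. The server clamps the requested value between `minSessionTimeout` and `maxSessionTimeout`, which default to 2 and 20 times `tickTime`. The sketch below only illustrates that rule. The `tickTime` of 2000 ms is an assumption taken from the sample `zoo.cfg`, since the cluster's real `zoo.cfg` is not shown in this post, and the function name is made up for illustration.

```python
def effective_session_timeout_ms(requested_ms, tick_time_ms=2000,
                                 min_session_timeout_ms=None,
                                 max_session_timeout_ms=None):
    """Return the session timeout a ZooKeeper server would grant.

    tick_time_ms=2000 is an assumed value (sample zoo.cfg), not taken
    from the cluster in question. When the min/max bounds are not set,
    ZooKeeper uses 2 * tickTime and 20 * tickTime respectively.
    """
    lo = (min_session_timeout_ms if min_session_timeout_ms is not None
          else 2 * tick_time_ms)
    hi = (max_session_timeout_ms if max_session_timeout_ms is not None
          else 20 * tick_time_ms)
    # Clamp the client's request into the [lo, hi] window.
    return max(lo, min(requested_ms, hi))


if __name__ == "__main__":
    # A 60 s request against the assumed default bounds.
    print(effective_session_timeout_ms(60_000))
```

Under these assumed defaults, a 60 s request would be cut down to 40 s unless `maxSessionTimeout` is raised in `zoo.cfg`.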

The HMaster error log is as follows:

2015-09-21 05:32:20,463 INFO  [main-SendThread(132.37.5.197:29184)] zookeeper.ClientCnxn: Socket connection established to 132.37.5.197/132.37.5.197:29184, initiating session
2015-09-21 05:32:20,465 FATAL [main-EventThread] master.HMaster: Master server abort: loaded coprocessors are: []
2015-09-21 05:32:20,465 INFO  [main-SendThread(132.37.5.197:29184)] zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x24f1f7bb79103a9 has expired, closing socket connection
2015-09-21 05:32:20,465 FATAL [main-EventThread] master.HMaster: master:60900-0x24f1f7bb79103a9, quorum=132.37.5.196:29184,132.37.5.195:29184,132.37.5.197:29184, baseZNode=/hbase master:60900-0x24f1f7bb79103a9 received expired from ZooKeeper, aborting
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:417)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:328)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2015-09-21 05:32:20,466 INFO  [main-EventThread] regionserver.HRegionServer: STOPPED: master:60900-0x24f1f7bb79103a9, quorum=132.37.5.196:29184,132.37.5.195:29184,132.37.5.197:29184, baseZNode=/hbase master:60900-0x24f1f7bb79103a9 received expired from ZooKeeper, aborting
2015-09-21 05:32:20,466 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-09-21 05:32:20,466 INFO  [master/pkgtstdb2/132.37.5.194:60900] regionserver.HRegionServer: Stopping infoServer
2015-09-21 05:32:20,468 INFO  [master/pkgtstdb2/132.37.5.194:60900] mortbay.log: Stopped SelectChannelConnector@0.0.0.0:60910
2015-09-21 05:32:20,570 INFO  [master/pkgtstdb2/132.37.5.194:60900] regionserver.HRegionServer: stopping server pkgtstdb2,60900,1442548707194
2015-09-21 05:32:20,570 INFO  [master/pkgtstdb2/132.37.5.194:60900] client.ConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
2015-09-21 05:32:20,570 INFO  [master/pkgtstdb2/132.37.5.194:60900] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x24f1f7bb79103ad
2015-09-21 05:32:20,572 INFO  [master/pkgtstdb2/132.37.5.194:60900] zookeeper.ZooKeeper: Session: 0x24f1f7bb79103ad closed
2015-09-21 05:32:20,573 INFO  [master/pkgtstdb2/132.37.5.194:60900-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-09-21 05:32:20,573 INFO  [master/pkgtstdb2/132.37.5.194:60900] regionserver.HRegionServer: stopping server pkgtstdb2,60900,1442548707194; all regions closed.
2015-09-21 05:32:20,573 INFO  [CatalogJanitor-pkgtstdb2:60900] master.CatalogJanitor: CatalogJanitor-pkgtstdb2:60900 exiting
2015-09-21 05:32:20,574 WARN  [master/pkgtstdb2/132.37.5.194:60900] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=132.37.5.196:29184,132.37.5.195:29184,132.37.5.197:29184, exception=org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master
2015-09-21 05:32:20,574 INFO  [pkgtstdb2:60900.oldLogCleaner] cleaner.LogCleaner: pkgtstdb2:60900.oldLogCleaner exiting
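
To confirm that the abort really falls in the same window every day, the expiry events can be pulled out of several days of master logs. This is a minimal sketch that assumes the log4j line layout seen in the excerpt above (timestamp, level, thread, logger). The function name and the idea of passing the log path on the command line are illustrative choices, not part of HBase.

```python
import re
import sys

# Timestamp and level at the start of a log line, e.g.
# "2015-09-21 05:32:20,465 FATAL [main-EventThread] ..."
LOG_LINE = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}):\d{2},\d{3}\s+([A-Z]+)\s"
)
MARKER = "received expired from ZooKeeper"


def expiry_times(lines):
    """Return (date, 'HH:MM') for every FATAL line reporting an expired session."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group(3) == "FATAL" and MARKER in line:
            hits.append((m.group(1), m.group(2)))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log path comes from the command line; no particular path is assumed.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for date, hhmm in expiry_times(fh):
            print(date, hhmm)
```

If the times cluster tightly from day to day, comparing them against scheduled jobs, GC logs, and the ZooKeeper server logs for the same minutes would be a reasonable next step.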

The HBase configuration file is as follows:

<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://gxuweg3tst2:8920/wa</value>
</property>
<property>
<name>hbase.master.port</name>
<value>60900</value>
<description>The port the HBase Master should bind to.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false for standalone mode and true for distributed mode. If
false, startup will run all HBase and ZooKeeper daemons together
in the one JVM.
</description>
</property>
<property>
<name>hbase.tmp.dir</name>
<!-- <value>/tmp/hbase-${user.name}</value> -->
<value>/uniiof/users/devdpp01/hbase/tmp</value>
<description>Temporary directory on the local filesystem.
Change this setting to point to a location more permanent
than '/tmp' (The '/tmp' directory is often cleared on
machine restart).
</description>
</property>
<property>
<name>hbase.master.info.port</name>
<value>60910</value>
<description>The port for the HBase Master web UI.
Set to -1 if you do not want a UI instance run.
</description>
</property>
<property>
<name>hbase.regionserver.port</name>
<value>60920</value>
<description>The port the HBase RegionServer binds to.
</description>
</property>
<property>
<name>hbase.regionserver.info.port</name>
<value>60930</value>
<description>The port for the HBase RegionServer web UI
Set to -1 if you do not want the RegionServer UI to run.
</description>
</property>
<!--
          The following three properties are used together to create the list of
               host:peer_port:leader_port quorum servers for ZooKeeper.
                    -->
<property>
<name>hbase.zookeeper.quorum</name>
<value>132.37.5.195,132.37.5.196,132.37.5.197</value>
<description>Comma separated list of servers in the ZooKeeper Quorum.
For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
By default this is set to localhost for local and pseudo-distributed modes
of operation. For a fully-distributed setup, this should be set to a full
list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which we will start/stop ZooKeeper on.
</description>
</property>
<property>
<name>hbase.zookeeper.peerport</name>
<value>29888</value>
<description>Port used by ZooKeeper peers to talk to each other.
See
http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperStarted.html#sc_RunningReplicatedZooKeeper
for more information.
</description>
</property>
<property>
<name>hbase.zookeeper.leaderport</name>
<value>39888</value>
<description>Port used by ZooKeeper for leader election.
See
http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperStarted.html#sc_RunningReplicatedZooKeeper
for more information.
</description>
</property>
<!-- End of properties used to generate ZooKeeper host:port quorum list. -->
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>29184</value>
<description>Property from ZooKeeper's config zoo.cfg.
The port at which the clients will connect.
</description>
</property>
<!-- End of properties that are directly mapped from ZooKeeper's zoo.cfg -->
<property>
<name>hbase.rest.port</name>
<value>8980</value>
<description>The port for the HBase REST server.</description>
</property>
</configuration>
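
The file as posted contains no `zookeeper.session.timeout` property, so it may be worth verifying where the 60 s setting actually lives. The following is a rough sketch for reading a Hadoop-style configuration file and reporting that property. The helper names are hypothetical, and it assumes the value, when present, is a plain integer in milliseconds.

```python
import sys
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Map each <property> name to its value in a Hadoop-style config file."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def session_timeout_ms(props):
    """Return zookeeper.session.timeout as an int, or None if it is not set."""
    raw = props.get("zookeeper.session.timeout")
    return int(raw) if raw else None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Path to hbase-site.xml is given on the command line (not assumed here).
    with open(sys.argv[1], encoding="utf-8") as fh:
        print(session_timeout_ms(read_properties(fh.read())))
```

A result of `None` would suggest the setting is not in the file the master reads, in which case HBase falls back to its built-in default.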

3 answers

  • LongRui888 2015-09-25 07:24

    The most critical error is: FATAL [main-EventThread] master.HMaster: Master server abort: loaded coprocessors are: []
