I have one master and 3 slaves.
Here's the problem:
After setting up the Hadoop cluster, I ran start-all.sh on the master, and the DataNode started on that machine instead of on the slave machines.
I'm using Hadoop 3.1 + JDK 1.8. Previously I followed a book and set up Hadoop 2.6 + OpenJDK 10, and that cluster
started normally: the NameNode and friends came up on the master, and the DataNodes came up on the slaves. With the new
version it just doesn't work, and I've been at it for a whole day...
Current state: all the machines can ping each other and SSH to each other,
and they all have normal internet access.
If one machine acts as both master and slave, the web UI on port 50070 (which of course became 9870 after Hadoop 3) opens fine and everything works.
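To show what I mean by "can SSH", here is the passwordless-SSH check written out as a dry run (the hostnames are hypothetical stand-ins for my slaves, and the real command is only shown in a comment):

```shell
# Dry-run sketch of the passwordless-SSH check mentioned above.
# Hostnames are hypothetical; the real check would be:
#   ssh -o BatchMode=yes "$host" true   (fails fast if a password is needed)
workers="eslave1 eslave2 eslave3"
checked=0
for host in $workers; do
  echo "would run: ssh -o BatchMode=yes $host true"
  checked=$((checked + 1))
done
echo "checked $checked hosts"
```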
When I run start-all.sh on the master:
WARNING: Attempting to start all Apache Hadoop daemons as hduser in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [emaster]
Starting datanodes
Starting secondary namenodes [emaster]
2018-05-04 22:39:37,858 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
Starting nodemanagers
It doesn't go to the slaves at all.
Running jps shows:
3233 SecondaryNameNode
3492 ResourceManager
2836 NameNode
3653 NodeManager
3973 Jps
3003 DataNode
Everything started on the master; nothing at all started on the slave machines.
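One detail that may matter here (I'm noting it as a guess about the cause, not something I've confirmed): in Hadoop 3.x the file that start-dfs.sh reads to decide which hosts run DataNodes was renamed from `slaves` to `workers`, and the shipped default contains only `localhost`. If the 3.x scripts only find a 2.x-style `slaves` file, starting the DataNode locally would match exactly the symptom above. A `workers` file would look like this (hostnames are hypothetical):

```
# $HADOOP_HOME/etc/hadoop/workers   (named "slaves" in Hadoop 2.x)
# one worker hostname per line, no extra whitespace
eslave1
eslave2
eslave3
```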
Looking at the DataNode log, I found that the DataNode on the master was started 3 times (my master's hostname is emaster). Excerpt:
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = emaster/192.168.56.100
STARTUP_MSG: args = []
STARTUP_MSG: version = 3.1.0
And each attempt ended with an error:
java.io.EOFException: End of File Exception between local host is: "emaster/192.168.56.100"; destination host is: "emaster":9000; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:789)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1495)
at org.apache.hadoop.ipc.Client.call(Client.java:1437)
at org.apache.hadoop.ipc.Client.call(Client.java:1347)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy20.sendHeartbeat(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:166)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:514)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:645)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:841)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1796)
at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1165)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1061)
2018-05-04 21:49:43,320 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: RECEIVED SIGNAL 15: SIGTERM
My guess: I configured 3 slaves, but something in the configuration is wrong, so the master doesn't start
the DataNodes on the slaves and instead starts its own DataNode. I just can't find the mistake. The
masters file and slaves file are both configured, and the machines can all ping each other. Could some expert please point this newbie in the right direction? Thanks a million!!!!
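In case it helps, the EOFException above shows the DataNode heartbeating to emaster:9000, which should match fs.defaultFS in core-site.xml. Here is a sketch of pulling that value out of a sample config to double-check it (the file below is recreated as an example, not my actual config):

```shell
# Sketch: extract fs.defaultFS from a core-site.xml.
# The sample file and its values are assumptions for illustration.
cat > /tmp/core-site.sample.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://emaster:9000</value>
  </property>
</configuration>
EOF

# grep the property name, take the following <value> line, strip the tags
fs_default=$(grep -A1 'fs.defaultFS' /tmp/core-site.sample.xml \
  | sed -n 's:.*<value>\(.*\)</value>.*:\1:p')
echo "$fs_default"
```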