hadoop ha安装程序:无法连接到zookeeper

jei2mxaa  于 2021-05-30  发布在  Hadoop
关注(0)|答案(1)|浏览(493)

我正试图按照下面的文章设置hadoopha。
http://hashprompt.blogspot.in/2015/01/fully-distributed-hadoop-cluster.html
配置之后,当我尝试运行

hdfs zkfc -formatZK

我得到以下错误。

15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client     environment:java.library.path=/opt/hadoop-2.6.0/lib/native
    15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:os.version=3.13.0-32-generic
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:user.name=huser
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/huser
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Client environment:user.dir=/opt/hadoop-2.6.0/sbin
15/03/30 12:18:14 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=mo-4594ddc63.mo.sap.corp:2181,mo-6dd5bf8b8.mo.sap.corp:2181,mo-e7b2822cb.mo.sap.corp:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@4d9e68d0
15/03/30 12:18:14 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-4594ddc63.mo.sap.corp/10.97.155.65:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:14 INFO zookeeper.ClientCnxn: Socket connection established to mo-4594ddc63.mo.sap.corp/10.97.155.65:2181, initiating session
15/03/30 12:18:14 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
15/03/30 12:18:15 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-e7b2822cb.mo.sap.corp/10.97.136.84:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:15 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
15/03/30 12:18:15 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-6dd5bf8b8.mo.sap.corp/10.97.156.12:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:15 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
15/03/30 12:18:17 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-4594ddc63.mo.sap.corp/10.97.155.65:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:17 INFO zookeeper.ClientCnxn: Socket connection established to mo-4594ddc63.mo.sap.corp/10.97.155.65:2181, initiating session
15/03/30 12:18:17 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
15/03/30 12:18:17 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-e7b2822cb.mo.sap.corp/10.97.136.84:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:17 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
15/03/30 12:18:18 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-6dd5bf8b8.mo.sap.corp/10.97.156.12:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:18 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
15/03/30 12:18:19 ERROR ha.ActiveStandbyElector: Connection timed out: couldn't connect to ZooKeeper in 5000 milliseconds
15/03/30 12:18:19 INFO zookeeper.ClientCnxn: Opening socket connection to server mo-4594ddc63.mo.sap.corp/10.97.155.65:2181. Will not attempt to authenticate using SASL (unknown error)
15/03/30 12:18:19 INFO zookeeper.ClientCnxn: Socket connection established to mo-4594ddc63.mo.sap.corp/10.97.155.65:2181, initiating session
15/03/30 12:18:20 INFO zookeeper.ZooKeeper: Session: 0x0 closed
15/03/30 12:18:20 INFO zookeeper.ClientCnxn: EventThread shut down
15/03/30 12:18:20 FATAL ha.ZKFailoverController: Unable to start failover controller. Unable to connect to ZooKeeper quorum at mo-4594ddc63.mo.sap.corp:2181,mo-6dd5bf8b8.mo.sap.corp:2181,mo-e7b2822cb.mo.sap.corp:2181. Please check the configured value for ha.zookeeper.quorum and ensure that ZooKeeper is running.

在zookeeper安装之后http://rajsyrus.blogspot.sg/2014/04/configuring-hadoop-high-availability.html),我用

./zkServer.sh start

但是当我看到它的状态时

./zkServer.sh status

结果如下

JMX enabled by default
Using config: /home/huser/zookeeper-3.4.6/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.

这意味着它可能运行不正常。
zoo.cfg的内容


# do not use /tmp for storage, /tmp here is just

# example sakes.

dataDir=/home/huser/zookeeper/data/
dataLogDir=/home/huser/zookeeper/log/
server.1=mo-4594ddc63.mo.sap.corp:2888:3888
server.2=mo-6dd5bf8b8.mo.sap.corp:2888:3888
server.3=mo-e7b2822cb.mo.sap.corp:2888:3888

# the port at which the clients will connect

clientPort=2181

# the maximum number of client connections.

# increase this if you need to handle more clients

# maxClientCnxns=60

# 

# Be sure to read the maintenance section of the

# administrator guide before turning on autopurge.

# 

# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance

# 

# The number of snapshots to retain in dataDir

# autopurge.snapRetainCount=3

# Purge task interval in hours

# Set to "0" to disable auto purge feature

# autopurge.purgeInterval=1

core-site.xml的内容

<configuration>
<property>
  <name>fs.default.name</name>
  <value>hdfs://auto-ha</value>
 </property>
 <property>
   <name>ha.zookeeper.quorum</name>
   <value>mo-4594ddc63.mo.sap.corp:2181,mo-6dd5bf8b8.mo.sap.corp:2181,mo-e7b2822cb.mo.sap.corp.hadoop.lab:2181</value>
 </property>
</configuration>

hdfs-site.xml的内容

<configuration>
<property>
  <name>dfs.replication</name>
  <value>2</value>
 </property>
 <property>
  <name>dfs.name.dir</name>
  <value>file:///hdfs/name</value>
 </property>
 <property>
  <name>dfs.data.dir</name>
  <value>file:///hdfs/data</value>
 </property>
 <property>
  <name>dfs.permissions</name>
  <value>false</value>
 </property>
 <property>
  <name>dfs.nameservices</name>
  <value>auto-ha</value>
 </property>
 <property>
  <name>dfs.ha.namenodes.auto-ha</name>
  <value>nn01,nn02</value>
 </property>
 <property>
  <name>dfs.namenode.rpc-address.auto-ha.nn01</name>
  <value>mo-4594ddc63.mo.sap.corp:8020</value>
 </property>
 <property>
  <name>dfs.namenode.http-address.auto-ha.nn01</name>
  <value>mo-4594ddc63.mo.sap.corp:50070</value>
 </property>
 <property>
  <name>dfs.namenode.rpc-address.auto-ha.nn02</name>
  <value>mo-6dd5bf8b8.mo.sap.corp:8020</value>
 </property>
 <property>
  <name>dfs.namenode.http-address.auto-ha.nn02</name>
  <value>mo-6dd5bf8b8.mo.sap.corp:50070</value>
 </property>
 <property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://mo-4594ddc63.mo.sap.corp:8485;mo-6dd5bf8b8.mo.sap.corp:8485;mo-e7b2822cb.mo.sap.corp:8485/auto-ha</value>
 </property>
 <property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/hdfs/journalnode</value>
 </property>
 <property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
 </property>
 <property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/huser/.ssh/id_rsa</value>
 </property>
 <property>
  <name>dfs.ha.automatic-failover.enabled.auto-ha</name>
  <value>true</value>
 </property>
 <property>
   <name>ha.zookeeper.quorum</name>
   <value>mo-4594ddc63.mo.sap.corp:2181,mo-6dd5bf8b8.mo.sap.corp:2181,mo-e7b2822cb.mo.sap.corp:2181</value>
 </property>
<property>
 <name>dfs.client.failover.proxy.provider.auto-ha</name>
 <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
</configuration>

任何指向错误解决方法的指针都会有很大帮助。
你好,subhankar
编辑
在做了拉杰什在回答中提到的事情之后,它似乎起了作用,因为没有任何错误。但是,在安装之后,运行pi示例会显示以下错误。

huser@mo-4594ddc63:~$ hadoop jar /opt/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 8 10000
Number of Maps  = 8
Samples per Map = 10000
15/03/31 13:23:08 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/huser/QuasiMonteCarlo_1427808186022_1353266286/in/part0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

        at org.apache.hadoop.ipc.Client.call(Client.java:1468)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/huser/QuasiMonteCarlo_1427808186022_1353266286/in/part0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

        at org.apache.hadoop.ipc.Client.call(Client.java:1468)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
15/03/31 13:23:08 ERROR hdfs.DFSClient: Failed to close inode 16390
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/huser/QuasiMonteCarlo_1427808186022_1353266286/in/part0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

        at org.apache.hadoop.ipc.Client.call(Client.java:1468)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)

看起来数据节点没有运行!!任何关于什么可能是错误的指针!
编辑2
在几次重试之后,我停止了所有操作并重新启动了所有节点。但现在看来namenode02还没有开始。当我运行命令hdfs haadmin-getservicestate nn02时,我得到一个错误操作失败:从mo-4594ddc63/10.97.155.65调用mo-6dd5bf8b8失败,连接异常:java.net.connectexception:连接被拒绝;有关更多详细信息,请参见:wiki.apache.org/hadoop/connectiondensed namenode02中未连接的日志。

2015-03-30 12:58:04,837 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 8020, call org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol.rollEditLog from 10.97.155.65:60502 Call#229 Retry#0: org.apache.hadoop.ipc.StandbyException: Operation category JOURNAL is not supported in state standby
2015-03-30 12:58:52,094 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Triggering log roll on remote NameNode mo-4594ddc63.mo.sap.corp/10.97.155.65:8020
2015-03-30 12:58:52,103 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category JOURNAL is not supported in state standby
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1719)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1350)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:6336)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:933)
        at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:139)
        at org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:11214)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

        at org.apache.hadoop.ipc.Client.call(Client.java:1468)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy15.rollEditLog(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:145)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:271)

在datanode中,我找到了这些日志

java.io.EOFException: End of File Exception between local host is: "mo-217e677f3.mo.sap.corp/10.97.168.28"; destination host is: "mo-4594ddc63.mo.sap.corp":8020; : java.io.EOFException; For more details see:  http://wiki.apache.org/hadoop/EOFException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
        at org.apache.hadoop.ipc.Client.call(Client.java:1472)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy12.sendHeartbeat(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:139)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:582)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1071)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)

/每个节点上的etc/hosts文件

10.97.156.12    localhost
10.97.156.12    mo-6dd5bf8b8.mo.sap.corp mo-6dd5bf8b8
10.97.155.65  mo-4594ddc63.mo.sap.corp

# 10.97.156.12  mo-6dd5bf8b8.mo.sap.corp

10.97.136.84  mo-e7b2822cb.mo.sap.corp
10.97.168.28  mo-217e677f3.mo.sap.corp
10.97.157.82  mo-fd6fa7b57.mo.sap.corp

ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
::1     ip6-localhost ip6-loopback
fe00::  ip6-localnet
ff00::  ip6-mcastprefix

每个节点的操作系统:ubuntu 12.04

kokeuurv

kokeuurv1#

在zoo.cfg中更改:

server.1=mo-4594ddc63.mo.sap.corp:2888:3888
server.2=mo-6dd5bf8b8.mo.sap.corp:2888:3888
server.3=mo-e7b2822cb.mo.sap.corp:2888:3888

server.1=mo-4594ddc63.mo.sap.corp:2888:3888
server.2=mo-6dd5bf8b8.mo.sap.corp:2889:3889
server.3=mo-e7b2822cb.mo.sap.corp:2890:3890

现在启动zookeeper并检查状态。

相关问题