weixin_45661471 2022-08-15 11:47 Acceptance rate: 72.7%
78 views
Closed

Uploading a file to HDFS fails with org.apache.hadoop.hdfs.CannotObtainBlockLengthException

Problem description and background

Uploading files to HDFS fails with an error. Uploading an empty file succeeds, but uploading a file that has content fails.

Runtime output and error

org.apache.hadoop.hdfs.CannotObtainBlockLengthException: Cannot obtain block length for LocatedBlock{BP-293563276-192.168.0.11-1634095772355:blk_1073784416_43592; getBlockSize()=0; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[192.168.0.11:9866,DS-0f59c3c4-7aae-4baa-b7df-786930725c03,DISK]]}
at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:363) ~[hadoop-hdfs-client-3.2.1.jar!/:?]
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:270) ~[hadoop-hdfs-client-3.2.1.jar!/:?]
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:201) ~[hadoop-hdfs-client-3.2.1.jar!/:?]
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:185) ~[hadoop-hdfs-client-3.2.1.jar!/:?]
at org.apache.hadoop.hdfs.DFSClient.openInternal(DFSClient.java:1048) ~[hadoop-hdfs-client-3.2.1.jar!/:?]
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1011) ~[hadoop-hdfs-client-3.2.1.jar!/:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:319) ~[hadoop-hdfs-client-3.2.1.jar!/:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:315) ~[hadoop-hdfs-client-3.2.1.jar!/:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-3.2.1.jar!/:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:327) ~[hadoop-hdfs-client-3.2.1.jar!/:?]
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:899) ~[hadoop-common-3.2.1.jar!/:?]
at com.jf.common.oss.service.impl.HdfsOSSImpl.getObject(HdfsOSSImpl.java:68) [jfcommon-oss-2.1.0.jar!/:?]
at com.jf.common.oss.JfOSSClient.getObject(JfOSSClient.java:90) [jfcommon-oss-2.1.0.jar!/:?]
at com.jfh.oss.OSSClientProxy.getOSSObject(OSSClientProxy.java:98) [classes!/:?]
at com.jfh.util.MergePartHelper.mergePartFile(MergePartHelper.java:78) [classes!/:?]
at com.jfh.helper.ViewFileHelper.mergePartFile(ViewFileHelper.java:125) [classes!/:?]
at com.jfh.controller.JFZGFileController.mergePartFile(JFZGFileController.java:99) [classes!/:?]
at com.jfh.controller.JFZGFileController$$FastClassBySpringCGLIB$$2bf96aa4.invoke() [classes!/:?]
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) [spring-core-5.1.15.RELEASE.jar!/:5.1.15.RELEASE]
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:752) [spring-aop-5.1.15.RELEASE.jar!/:5.1.15.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) [spring-aop-5.1.15.RELEASE.jar!/:5.1.15.RELEASE]
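
For context: the frames above show the failure happens while reading, not while writing. HdfsOSSImpl.getObject calls FileSystem.open, which constructs a DFSInputStream; before any bytes can be read the client must obtain the length of the file's last block from a datanode, and that is the step that fails. Below is a minimal sketch of that read path; the namenode URI and file path are assumptions, not taken from the post.

    // Sketch only: reproduces the call chain in the stack trace
    // (FileSystem.open -> DFSInputStream). The URI and path are hypothetical.
    import java.io.InputStream;
    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class OpenPartFile {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // hypothetical namenode address
        FileSystem fs = FileSystem.get(URI.create("hdfs://192.168.0.11:8020"), conf);
        // open() is where CannotObtainBlockLengthException is thrown when the
        // last block's length cannot be obtained from any datanode
        try (InputStream in = fs.open(new Path("/data/parts/part-000001"))) { // hypothetical path
          IOUtils.copyBytes(in, System.out, 4096, false);
        }
      }
    }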

My approach and what I have tried

Running hdfs fsck -openforwrite shows no files reported as OPENFORWRITE (see the sketch after the screenshot below).

[screenshot: hdfs fsck output; the cluster has a single datanode]
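
For reference, one common cause of this exception is a file whose last block is still under construction because a previous writer never closed it or its lease was never recovered; fsck -openforwrite usually shows that, but it can also be checked and repaired from client code. A hedged sketch follows, using the public DistributedFileSystem methods isFileClosed and recoverLease with a hypothetical part-file path; the CLI equivalent is hdfs debug recoverLease -path <file>.

    // Sketch only: check whether a file is fully closed and, if not, ask the
    // NameNode to recover its lease so the last block can be finalized.
    // The file path is hypothetical; recovery may need to be polled/retried.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class RecoverPartFileLease {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        if (fs instanceof DistributedFileSystem) {
          DistributedFileSystem dfs = (DistributedFileSystem) fs;
          Path part = new Path("/data/parts/part-000001"); // hypothetical
          if (!dfs.isFileClosed(part)) {
            // true if the lease was released immediately, false if recovery is still in progress
            boolean recovered = dfs.recoverLease(part);
            System.out.println("recoverLease returned: " + recovered);
          }
        }
      }
    }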


6 answers

  • wgyang_gz 2022-08-15 19:00

    There isn't enough information to judge this precisely. I looked up the source; the method that throws the error is as follows:

      /** Read the block length from one of the datanodes. */
      private long readBlockLength(LocatedBlock locatedblock) throws IOException {
        assert locatedblock != null : "LocatedBlock cannot be null";
        int replicaNotFoundCount = locatedblock.getLocations().length;
    
        final DfsClientConf conf = dfsClient.getConf();
        final int timeout = conf.getSocketTimeout();
        LinkedList<DatanodeInfo> nodeList = new LinkedList<DatanodeInfo>(
            Arrays.asList(locatedblock.getLocations()));
        LinkedList<DatanodeInfo> retryList = new LinkedList<DatanodeInfo>();
        boolean isRetry = false;
        StopWatch sw = new StopWatch();
        while (nodeList.size() > 0) {
          DatanodeInfo datanode = nodeList.pop();
          ClientDatanodeProtocol cdp = null;
          try {
            cdp = DFSUtilClient.createClientDatanodeProtocolProxy(datanode,
                dfsClient.getConfiguration(), timeout,
                conf.isConnectToDnViaHostname(), locatedblock);
    
            final long n = cdp.getReplicaVisibleLength(locatedblock.getBlock());
    
            if (n >= 0) {
              return n;
            }
          } catch (IOException ioe) {
            checkInterrupted(ioe);
            if (ioe instanceof RemoteException) {
              if (((RemoteException) ioe).unwrapRemoteException() instanceof
                  ReplicaNotFoundException) {
                // replica is not on the DN. We will treat it as 0 length
                // if no one actually has a replica.
                replicaNotFoundCount--;
              } else if (((RemoteException) ioe).unwrapRemoteException() instanceof
                  RetriableException) {
                // add to the list to be retried if necessary.
                retryList.add(datanode);
              }
            }
            DFSClient.LOG.debug("Failed to getReplicaVisibleLength from datanode {}"
                  + " for block {}", datanode, locatedblock.getBlock(), ioe);
          } finally {
            if (cdp != null) {
              RPC.stopProxy(cdp);
            }
          }
    
          // Ran out of nodes, but there are retriable nodes.
          if (nodeList.size() == 0 && retryList.size() > 0) {
            nodeList.addAll(retryList);
            retryList.clear();
            isRetry = true;
          }
    
          if (isRetry) {
            // start the stop watch if not already running.
            if (!sw.isRunning()) {
              sw.start();
            }
            try {
              Thread.sleep(500); // delay between retries.
            } catch (InterruptedException e) {
              Thread.currentThread().interrupt();
              throw new InterruptedIOException(
                  "Interrupted while getting the length.");
            }
          }
    
          // see if we ran out of retry time
          if (sw.isRunning() && sw.now(TimeUnit.MILLISECONDS) > timeout) {
            break;
          }
        }
    
        // Namenode told us about these locations, but none know about the replica
        // means that we hit the race between pipeline creation start and end.
        // we require all 3 because some other exception could have happened
        // on a DN that has it.  we want to report that error
        if (replicaNotFoundCount == 0) {
          return 0;
        }
    
        throw new CannotObtainBlockLengthException(locatedblock);
      }
    

    In the method above there are only two places that return normally; neither was reached in your case, so this exception was thrown. The early return was not reached because the replica's visible length n obtained from the datanode was < 0. The relevant code:

    final long n = cdp.getReplicaVisibleLength(locatedblock.getBlock());

    if (n >= 0) {
      return n;
    }

    Replica issues are usually storage-related, and I notice from your screenshot that the cluster has only one datanode. It may be that this datanode has run out of disk space; that is worth checking.
    With the limited information available, this is about as far as I can guess.
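
    If storage exhaustion on that single datanode is the suspicion, the remaining capacity can also be checked programmatically. A minimal sketch, not from the original post; it reports roughly the summary that hdfs dfsadmin -report prints.

    // Sketch only: query aggregate capacity/usage of the cluster the client
    // is configured against.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FsStatus;

    public class ClusterSpace {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FsStatus status = fs.getStatus();
        System.out.printf("capacity=%d used=%d remaining=%d%n",
            status.getCapacity(), status.getUsed(), status.getRemaining());
      }
    }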

    This answer was accepted by the asker as the best answer.


Question timeline

  • Closed by the system on Aug 24
  • Answer accepted on Aug 16
  • Question created on Aug 15
