
Disk usage feeds into capacity balancing. When each DataNode (DN) starts, it launches a thread that periodically measures disk usage and reports it to the NameNode (NN). That thread is named refreshUsed; it invokes the operating-system command `du -sk` to obtain the usage of each data-directory volume and caches the returned value.
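To make the mechanism concrete, here is a minimal standalone sketch (not the HDFS code itself) of shelling out to `du -sk` and reading back the used-kilobytes figure; the class and method names are made up for illustration:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class DuCommand {
    // Run `du -sk <dir>` and parse the first tab-separated token,
    // which is the directory's usage in kilobytes.
    static long usedKilobytes(String dir) throws Exception {
        Process p = new ProcessBuilder("du", "-sk", dir).start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line = r.readLine(); // e.g. "1234\t<dir>"
            p.waitFor();
            return Long.parseLong(line.split("\t")[0]);
        }
    }
}
```

This only works on platforms that ship `du`, which is exactly why Hadoop wraps the call in its Shell infrastructure, as the rest of this article shows.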

The thread is created when the internal data structure FSDataset is constructed:

public FSDataset(DataStorage storage, Configuration conf) throws IOException {
  this.maxBlocksPerDir = conf.getInt("dfs.datanode.numblocks", 64);
  // Number of failed data directories the configuration tolerates
  final int volFailuresTolerated =
      conf.getInt("dfs.datanode.failed.volumes.tolerated", 0);
  // The configured data directories
  String[] dataDirs = conf.getStrings(DataNode.DATA_DIR_KEY);
  int volsConfigured = 0;
  // Number of configured directories
  if (dataDirs != null)
    volsConfigured = dataDirs.length;
  // Number of failed directories
  int volsFailed = volsConfigured - storage.getNumStorageDirs();
  // Fail fast if the failure count is negative or exceeds the threshold
  if (volsFailed < 0 ||
      volsFailed > volFailuresTolerated) {
    throw new DiskErrorException("Invalid value for volsFailed : "
        + volsFailed + " , Volumes tolerated : " + volFailuresTolerated);
  }
  // Number of valid directories we expect
  this.validVolsRequired = volsConfigured - volFailuresTolerated;
  // This count is range-checked as well
  if (validVolsRequired < 1 ||
      validVolsRequired > storage.getNumStorageDirs()) {
    throw new DiskErrorException("Invalid value for validVolsRequired : "
        + validVolsRequired + " , Current valid volumes: " + storage.getNumStorageDirs());
  }
  // Array of file volumes
  FSVolume[] volArray = new FSVolume[storage.getNumStorageDirs()];
  // Build every volume
  for (int idx = 0; idx < storage.getNumStorageDirs(); idx++) {
    // Constructing each FSVolume also starts the refreshUsed thread
    volArray[idx] = new FSVolume(storage.getStorageDir(idx).getCurrentDir(), conf);
  }
  // The rest sets up asynchronous block reporting, which we skip here
  volumes = new FSVolumeSet(volArray);
  volumes.getVolumeMap(volumeMap);
  asyncBlockReport = new AsyncBlockReport(this);
  asyncBlockReport.start();
  File[] roots = new File[storage.getNumStorageDirs()];
  for (int idx = 0; idx < storage.getNumStorageDirs(); idx++) {
    roots[idx] = storage.getStorageDir(idx).getCurrentDir();
  }
  asyncDiskService = new FSDatasetAsyncDiskService(roots);
  registerMBean(storage.getStorageID());
}
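The two range checks in the constructor can be isolated into a small standalone sketch (hypothetical class and method names, not HDFS code): the failure count must lie in [0, tolerated], and the number of valid volumes we require must lie in [1, numStorageDirs].

```java
public class VolumeCheck {
    // Mirrors the two sanity checks in the FSDataset constructor.
    static int validVolsRequired(int volsConfigured, int volFailuresTolerated,
                                 int numStorageDirs) {
        // Directories that were configured but did not survive startup
        int volsFailed = volsConfigured - numStorageDirs;
        if (volsFailed < 0 || volsFailed > volFailuresTolerated) {
            throw new IllegalArgumentException(
                "Invalid value for volsFailed : " + volsFailed);
        }
        // Minimum number of healthy volumes the DataNode insists on
        int required = volsConfigured - volFailuresTolerated;
        if (required < 1 || required > numStorageDirs) {
            throw new IllegalArgumentException(
                "Invalid value for validVolsRequired : " + required);
        }
        return required;
    }
}
```

For example, with 3 configured directories, 1 tolerated failure, and 2 surviving storage directories, one volume has failed but startup proceeds with 2 required valid volumes.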

The core work happens when each FSVolume is created: it handles files left in the detach directory, the tmp directory, and (when the append feature is enabled) the blocksBeingWritten directory, and finally starts the usage-monitoring thread.

FSVolume(File currentDir, Configuration conf) throws IOException {
      // Disk space the configuration reserves on each volume
      this.reserved = conf.getLong("dfs.datanode.du.reserved", 0);
      // Build the directory tree (recursively)
      this.dataDir = new FSDir(currentDir);
      this.currentDir = currentDir;
      // Whether append is supported decides how blocksBeingWritten is handled
      boolean supportAppends = conf.getBoolean("dfs.support.append", false);
      // The parent directory holds several sibling directories to check
      File parent = currentDir.getParentFile();
      // If a detach directory exists, recover its blocks into current
      this.detachDir = new File(parent, "detach");
      if (detachDir.exists()) {
        recoverDetachedBlocks(currentDir, detachDir);
      }
      // remove all blocks from "tmp" directory. These were either created
      // by pre-append clients (0.18.x) or are part of a replication request.
      // They can be safely removed.
      this.tmpDir = new File(parent, "tmp");
      if (tmpDir.exists()) {
        FileUtil.fullyDelete(tmpDir);
      }
      // If append is supported, recover this directory; otherwise delete it
      blocksBeingWritten = new File(parent, "blocksBeingWritten");
      if (blocksBeingWritten.exists()) {
        if (supportAppends) {
          recoverBlocksBeingWritten(blocksBeingWritten);
        } else {
          FileUtil.fullyDelete(blocksBeingWritten);
        }
      }
      // Create any of these directories that are missing, e.g. on a new node
      if (!blocksBeingWritten.mkdirs()) {
        if (!blocksBeingWritten.isDirectory()) {
          throw new IOException("Mkdirs failed to create " + blocksBeingWritten.toString());
        }
      }
      if (!tmpDir.mkdirs()) {
        if (!tmpDir.isDirectory()) {
          throw new IOException("Mkdirs failed to create " + tmpDir.toString());
        }
      }
      if (!detachDir.mkdirs()) {
        if (!detachDir.isDirectory()) {
          throw new IOException("Mkdirs failed to create " + detachDir.toString());
        }
      }
      this.usage = new DF(parent, conf);
      // DU monitors disk usage: its constructor runs one scan immediately,
      // and the thread then rescans every 10 minutes
      this.dfsUsage = new DU(parent, conf);
      // Start the monitoring thread, discussed below
      this.dfsUsage.start();
    }
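Note the repeated `mkdirs()`-then-`isDirectory()` idiom above: `mkdirs()` returns false both on real failure and when the directory already exists, so only the combination "could not create AND is not a directory" is an error. A minimal illustrative helper (not HDFS code) capturing that idiom:

```java
import java.io.File;
import java.io.IOException;

public class DirUtil {
    // Create dir if missing; tolerate the case where it already exists.
    // Only "mkdirs failed AND the path is not a directory" is fatal.
    static void ensureDir(File dir) throws IOException {
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Mkdirs failed to create " + dir);
        }
    }
}
```

Calling it twice on the same path is harmless, which is exactly the behavior a restarting DataNode needs.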

Next, look at how the DU class is created. It extends the Shell class and inherits Shell's behavior; its whole purpose is to execute an operating-system command. Presumably Hadoop shells out to `du` directly to keep the code simple.

public DU(File path, Configuration conf) throws IOException {
    // Check every 10 minutes
    this(path, 600000L);
  }
  public DU(File path, long interval) throws IOException {
    super(0);
    // we set the Shell interval to 0 so it will always run our command
    // and use this one to set the thread sleep interval
    this.refreshInterval = interval;
    this.dirPath = path.getCanonicalPath();
    // Run one check immediately
    run();
  }

The run() method performs the actual usage check:

/** check to see if a command needs to be executed and execute if needed */
  protected void run() throws IOException {
    if (lastTime + interval > System.currentTimeMillis())
      return;
    exitCode = 0; // reset for next run
    // Execute the command. Where does the command come from, and how is it run?
    runCommand();
  }
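The guard at the top of run() is a simple time-based throttle: skip the work entirely if the previous execution was less than `interval` milliseconds ago. A self-contained sketch of that pattern (hypothetical class, with a counter standing in for runCommand()):

```java
public class Throttle {
    private long lastTime = 0;       // timestamp of the last real run (ms)
    private final long interval;     // minimum gap between runs (ms)
    private int executions = 0;      // how many times the work actually ran

    Throttle(long intervalMs) { this.interval = intervalMs; }

    void run() {
        if (lastTime + interval > System.currentTimeMillis())
            return;                  // too soon since the last run, do nothing
        lastTime = System.currentTimeMillis();
        executions++;                // stand-in for runCommand()
    }

    int executions() { return executions; }
}
```

With a one-hour interval, two back-to-back calls execute the work exactly once; in DU the same guard means that even if run() is invoked eagerly, `du` is not forked more often than the refresh interval allows.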

The function below is where the usage figure is actually obtained:

private void runCommand() throws IOException {
    // Build a ProcessBuilder for the shell command; the concrete command
    // comes from the DU subclass: du -sk xxxx
    ProcessBuilder builder = new ProcessBuilder(getExecString());
    Timer timeOutTimer = null;
    ShellTimeoutTimerTask timeoutTimerTask = null;
    timedOut = new AtomicBoolean(false);
    completed = new AtomicBoolean(false);
    if (environment != null) {
      builder.environment().putAll(this.environment);
    }
    if (dir != null) {
      builder.directory(this.dir);
    }
    // Launch the command
    process = builder.start();
    if (timeOutInterval > 0) {
      timeOutTimer = new Timer();
      timeoutTimerTask = new ShellTimeoutTimerTask(this);
      // One time scheduling.
      timeOutTimer.schedule(timeoutTimerTask, timeOutInterval);
    }
    // Wrap the output and error streams; what they yield is used either
    // for error reporting or for the final value
    final BufferedReader errReader =
            new BufferedReader(new InputStreamReader(process.getErrorStream()));
    BufferedReader inReader =
            new BufferedReader(new InputStreamReader(process.getInputStream()));
    // Buffer for error messages
    final StringBuffer errMsg = new StringBuffer();
    // A dedicated thread collects the error stream
    Thread errThread = new Thread() {
      @Override
      public void run() {
        try {
          String line = errReader.readLine();
          while ((line != null) && !isInterrupted()) {
            errMsg.append(line);
            errMsg.append(System.getProperty("line.separator"));
            line = errReader.readLine();
          }
        } catch (IOException ioe) {
          LOG.warn("Error reading the error stream", ioe);
        }
      }
    };
    try {
      errThread.start();
    } catch (IllegalStateException ise) { }
    try {
      // Parse the command's output; only the first line is used
      parseExecResult(inReader); // parse the output
      // Drain the rest of the input stream
      String line = inReader.readLine();
      while (line != null) {
        line = inReader.readLine();
      }
      // Get the command's exit code
      exitCode = process.waitFor();
      try {
        // Wait for the error-collecting thread to finish
        errThread.join();
      } catch (InterruptedException ie) {
        LOG.warn("Interrupted while reading the error stream", ie);
      }
      completed.set(true);
      // Cleanup code follows; omitted here
      ……………
  }
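The errThread pattern above matters: stderr must be drained on a separate thread, otherwise a chatty child process can fill the stderr pipe and deadlock against the parent reading stdout. A self-contained sketch of that pattern (hypothetical class name, same structure as the anonymous thread above):

```java
import java.io.BufferedReader;
import java.io.IOException;

public class StreamDrain {
    // Drain a reader on a helper thread, appending each line to a buffer,
    // so the producing process can never block on a full pipe.
    static Thread drain(final BufferedReader reader, final StringBuffer sink) {
        Thread t = new Thread() {
            @Override
            public void run() {
                try {
                    String line;
                    while ((line = reader.readLine()) != null && !isInterrupted()) {
                        sink.append(line).append(System.getProperty("line.separator"));
                    }
                } catch (IOException ioe) {
                    // in Shell this is logged with LOG.warn and swallowed
                }
            }
        };
        t.start();
        return t;
    }
}
```

The caller joins the drain thread after `process.waitFor()`, exactly as runCommand() does, so the collected error text is complete before it is used in an error message.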



Now let's see how the value is assigned when the command succeeds:

protected void parseExecResult(BufferedReader lines) throws IOException {
    // Read the first line
    String line = lines.readLine();
    if (line == null) {
      throw new IOException("Expecting a line not the end of stream");
    }
    String[] tokens = line.split("\t");
    if (tokens.length == 0) {
      throw new IOException("Illegal du output");
    }
    // This is where the value is actually assigned
    this.used.set(Long.parseLong(tokens[0]) * 1024);
}
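`du -sk` prints one line of the form `<kilobytes>\t<path>`, so parseExecResult keeps the first tab-separated token and multiplies by 1024 to get bytes. A standalone sketch of the same parsing (hypothetical class, returning the value instead of storing it in a field):

```java
import java.io.BufferedReader;
import java.io.IOException;

public class DuParse {
    // Parse one line of `du -sk` output ("<kilobytes>\t<path>")
    // and return the usage in bytes.
    static long parseDuOutput(BufferedReader lines) throws IOException {
        String line = lines.readLine();
        if (line == null) {
            throw new IOException("Expecting a line not the end of stream");
        }
        String[] tokens = line.split("\t");
        if (tokens.length == 0) {
            throw new IOException("Illegal du output");
        }
        return Long.parseLong(tokens[0]) * 1024;
    }
}
```

For example, the line `8\t/data/dn1` parses to 8192 bytes.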

At the end of volume creation the monitoring thread is started; here is that flow:

this.dfsUsage.start();

public void start() {
    // only start the thread if the interval is sane
    if (refreshInterval > 0) {
      refreshUsed = new Thread(new DURefreshThread(), // the real thread body, analyzed below
          "refreshUsed-" + dirPath);
      refreshUsed.setDaemon(true); // run as a background daemon thread
      refreshUsed.start();         // start the thread
    }
  }

So once we understand the run() method of DURefreshThread, we know exactly what this thread does:

public void run() {
      while (shouldRun) {
        try {
          Thread.sleep(refreshInterval); // 10 minutes by default
          try {
            // The inner class calls the outer class's run() to refresh
            // the directory's usage figure
            DU.this.run();
          } catch (IOException e) {
            synchronized (DU.this) {
              // save the latest exception so we can return it in getUsed()
              duException = e;
            }
            LOG.warn("Could not get disk usage information", e);
          }
        } catch (InterruptedException e) {
        }
      }
}
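Putting the pieces together, the whole thread boils down to "sleep, refresh, repeat, survive failures." A minimal runnable sketch of that loop shape (hypothetical class; a counter stands in for DU.this.run(), and the daemon flag mirrors start() above):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class Refresher {
    private volatile boolean shouldRun = true;
    final AtomicInteger refreshes = new AtomicInteger();

    // Sleep for the interval, run one refresh, and keep looping;
    // one failed pass must not kill the thread.
    Thread start(final long intervalMs) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                while (shouldRun) {
                    try {
                        Thread.sleep(intervalMs);
                        refreshes.incrementAndGet(); // stand-in for DU.this.run()
                    } catch (InterruptedException e) {
                        // ignored, exactly as in the original loop
                    }
                }
            }
        }, "refreshUsed-demo");
        t.setDaemon(true); // daemon, so it never blocks JVM shutdown
        t.start();
        return t;
    }

    void stop() { shouldRun = false; }
}
```

Because the thread is a daemon and swallows InterruptedException, shutdown relies on the `shouldRun` flag (DU sets it in shutdown()) rather than on interruption.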

The flow is summarized in the figure below.

[Figure: overall flow of the refreshUsed / DU disk-usage monitoring thread]