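In HBase, both failure recovery and master-slave replication are built on the HLog. By default, the data of every write operation (put, update, and delete) is first appended to the HLog and only then written to the MemStore. Most of the time the HLog is never read. If a RegionServer crashes under some abnormal condition, however, data that was written to the MemStore but not yet flushed to disk is lost, and the HLog has to be replayed to recover it. In addition, HBase replication requires the master cluster to send its HLog entries to the slave cluster, which replays them locally to complete the data replication between clusters.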
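The log-first ordering and the replay after a crash can be illustrated with a toy store. This is only a sketch under simplifying assumptions: `TinyWalStore` and its methods are hypothetical names, the "log" is an in-memory list rather than a file on HDFS, and a delete simply removes the key instead of writing a tombstone as HBase does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy write-ahead log: every edit is appended to the log before it touches memory. */
final class TinyWalStore {
    // Stand-in for the HLog; in HBase this would be a file on HDFS.
    private final List<String[]> log = new ArrayList<>();
    // Stand-in for the MemStore: sorted, in memory, lost on a crash.
    private TreeMap<String, String> memStore = new TreeMap<>();

    void put(String row, String value) {
        log.add(new String[] {row, value}); // 1. append to the log first
        memStore.put(row, value);           // 2. then apply to memory
    }

    void delete(String row) {
        log.add(new String[] {row, null});  // a delete is logged as an edit too
        memStore.remove(row);
    }

    /** Simulates a crash: unflushed in-memory state is gone, the log survives. */
    void crash() {
        memStore = new TreeMap<>();
    }

    /** Rebuilds the in-memory state by replaying the log in append order. */
    void replay() {
        for (String[] edit : log) {
            if (edit[1] == null) {
                memStore.remove(edit[0]);
            } else {
                memStore.put(edit[0], edit[1]);
            }
        }
    }

    String get(String row) {
        return memStore.get(row);
    }
}
```

Because edits are replayed in the order they were appended, the rebuilt state matches what the MemStore held just before the crash.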

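1. HLog file structure

The basic structure of an HLog file is shown in the figure.

[Figure: basic structure of an HLog file]

---Each RegionServer has one or more HLogs. An HLog is shared by multiple Regions; in the figure, Region A, Region B, and Region C share one HLog file.

---Within an HLog, the log unit WALEntry (the small boxes in the figure) is the smallest unit appended for one row-level update. It consists of two parts: HLogKey and WALEdit.

(1) HLogKey

Consists of fields such as table name, region name, and sequenceid.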

WALKeyImpl:

@InterfaceAudience.Private

protected void init(final byte[] encodedRegionName,
                      final TableName tablename,
                      long logSeqNum,
                      final long now,
                      List<UUID> clusterIds,
                      long nonceGroup,
                      long nonce,
                      MultiVersionConcurrencyControl mvcc,
                      NavigableMap<byte[], Integer> replicationScope,
                      Map<String, byte[]> extendedAttributes) {
    this.sequenceId = logSeqNum;
    this.writeTime = now;
    this.clusterIds = clusterIds;
    this.encodedRegionName = encodedRegionName;
    this.tablename = tablename;
    this.nonceGroup = nonceGroup;
    this.nonce = nonce;
    this.mvcc = mvcc;
    if (logSeqNum != NO_SEQUENCE_ID) {
      setSequenceId(logSeqNum);
    }
    this.replicationScope = replicationScope;
    this.extendedAttributes = extendedAttributes;

  }

(2) WALEdit

Represents the set of updates that belong to one transaction.

KeyValue:

private static byte [] createByteArray(final byte [] row, final int roffset,
      final int rlength, final byte [] family, final int foffset, int flength,
      final byte [] qualifier, final int qoffset, int qlength,
      final long timestamp, final Type type,
      final byte [] value, final int voffset,
      int vlength, byte[] tags, int tagsOffset, int tagsLength) {


    checkParameters(row, rlength, family, flength, qlength, vlength);
    RawCell.checkForTagsLength(tagsLength);
    // Allocate right-sized byte array.
    int keyLength = (int) getKeyDataStructureSize(rlength, flength, qlength);
    byte[] bytes = new byte[(int) getKeyValueDataStructureSize(rlength, flength, qlength, vlength,
      tagsLength)];

    // Write key, value and key row length.
    int pos = 0;
    pos = Bytes.putInt(bytes, pos, keyLength);
    pos = Bytes.putInt(bytes, pos, vlength);
    pos = Bytes.putShort(bytes, pos, (short)(rlength & 0x0000ffff));
    pos = Bytes.putBytes(bytes, pos, row, roffset, rlength);
    pos = Bytes.putByte(bytes, pos, (byte)(flength & 0x0000ff));

    if(flength != 0) {
      pos = Bytes.putBytes(bytes, pos, family, foffset, flength);
    }

    if(qlength != 0) {
      pos = Bytes.putBytes(bytes, pos, qualifier, qoffset, qlength);
    }

    pos = Bytes.putLong(bytes, pos, timestamp);

    pos = Bytes.putByte(bytes, pos, type.getCode());

    if (value != null && value.length > 0) {
      pos = Bytes.putBytes(bytes, pos, value, voffset, vlength);
    }

    // Add the tags after the value part

    if (tagsLength > 0) {
      pos = Bytes.putAsShort(bytes, pos, tagsLength);
      pos = Bytes.putBytes(bytes, pos, tags, tagsOffset, tagsLength);
    }
    return bytes;

  }
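The byte layout that createByteArray produces can be reproduced with plain java.nio to make the field order easier to see. The sketch below is an illustration, not HBase code: `KeyValueLayoutSketch`, `encode`, and `rowOf` are hypothetical names, tags are left out, and the key length is computed directly from the fields the excerpt writes (2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class KeyValueLayoutSketch {
    /** Serializes one cell without tags, in the field order createByteArray uses. */
    static byte[] encode(byte[] row, byte[] family, byte[] qualifier,
                         long timestamp, byte typeCode, byte[] value) {
        // Key part: row length (2), row, family length (1), family,
        // qualifier, timestamp (8), type (1).
        int keyLength = 2 + row.length + 1 + family.length + qualifier.length + 8 + 1;
        // ByteBuffer is big-endian by default.
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + keyLength + value.length);
        buf.putInt(keyLength);
        buf.putInt(value.length);
        buf.putShort((short) row.length);
        buf.put(row);
        buf.put((byte) family.length);
        buf.put(family);
        buf.put(qualifier);   // no explicit length: it follows from keyLength
        buf.putLong(timestamp);
        buf.put(typeCode);
        buf.put(value);
        return buf.array();
    }

    /** Reads the row back out of an encoded cell. */
    static String rowOf(byte[] encoded) {
        ByteBuffer buf = ByteBuffer.wrap(encoded);
        buf.position(8);      // skip keyLength and valueLength
        int rowLength = buf.getShort();
        byte[] row = new byte[rowLength];
        buf.get(row);
        return new String(row, StandardCharsets.UTF_8);
    }
}
```

Note that the qualifier carries no length prefix of its own; a reader derives it from the key length once the row and family lengths are known.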

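2. WAL write path (FSHLog)

Before analyzing the WAL write path, take a brief look at how HBase handles put/delete. The core class is HRegion.java: a put or delete first locates the region that owns the data and then calls the corresponding HRegion method. The put operation serves as the example below.


The put method

The put method is the entry point. It calls doBatchMutate, which delegates to batchMutate and finally to doMiniBatchMutate, where the actual work happens.
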

@Override
  public void put(Put put) throws IOException {
    checkReadOnly(); // check whether the region is read-only

    // Do a rough check that we have resources to accept a write.  The check is
    // 'rough' in that between the resource check and the call to obtain a
    // read lock, resources may run out.  For now, the thought is that this
    // will be extremely rare; we'll deal with it when it happens.
    checkResources();
    startRegionOperation(Operation.PUT);
    try {
      // All edits for the given row (across all column families) must happen atomically.
      doBatchMutate(put);
    } finally {
      closeRegionOperation(Operation.PUT);
    }
  }

  private void doBatchMutate(Mutation mutation) throws IOException {
    // Currently this is only called for puts and deletes, so no nonces.
    OperationStatus[] batchMutate = this.batchMutate(new Mutation[]{mutation});
    if (batchMutate[0].getOperationStatusCode().equals(OperationStatusCode.SANITY_CHECK_FAILURE)) {
      throw new FailedSanityCheckException(batchMutate[0].getExceptionMsg());
    } else if (batchMutate[0].getOperationStatusCode().equals(OperationStatusCode.BAD_FAMILY)) {
      throw new NoSuchColumnFamilyException(batchMutate[0].getExceptionMsg());
    } else if (batchMutate[0].getOperationStatusCode().equals(OperationStatusCode.STORE_TOO_BUSY)) {
      throw new RegionTooBusyException(batchMutate[0].getExceptionMsg());
    }
  }

/**
   * Perform a batch of mutations.
   *
   * It supports only Put and Delete mutations and will ignore other types passed. Operations in
   * a batch are stored with highest durability specified of for all operations in a batch,
   * except for {@link Durability#SKIP_WAL}.
   *
   * <p>This function is called from {@link #batchReplay(WALSplitUtil.MutationReplay[], long)} with
   * {@link ReplayBatchOperation} instance and {@link #batchMutate(Mutation[], long, long)} with
   * {@link MutationBatchOperation} instance as an argument. As the processing of replay batch
   * and mutation batch is very similar, lot of code is shared by providing generic methods in
   * base class {@link BatchOperation}. The logic for this method and
   * {@link #doMiniBatchMutate(BatchOperation)} is implemented using methods in base class which
   * are overridden by derived classes to implement special behavior.
   *
   * @param batchOp contains the list of mutations
   * @return an array of OperationStatus which internally contains the
   *         OperationStatusCode and the exceptionMessage if any.
   * @throws IOException if an IO problem is encountered
   */
  OperationStatus[] batchMutate(BatchOperation<?> batchOp) throws IOException {
    boolean initialized = false;
    batchOp.startRegionOperation();
    try {
      while (!batchOp.isDone()) {
        if (!batchOp.isInReplay()) {
          checkReadOnly();
        }
        checkResources();

        if (!initialized) {
          this.writeRequestsCount.add(batchOp.size());
          // validate and prepare batch for write, for MutationBatchOperation it also calls CP
          // prePut()/ preDelete() hooks
          batchOp.checkAndPrepare();
          initialized = true;
        }
        doMiniBatchMutate(batchOp);
        requestFlushIfNeeded();
      }
    } finally {
      if (rsServices != null && rsServices.getMetrics() != null) {
        rsServices.getMetrics().updateWriteQueryMeter(this.htableDescriptor.
          getTableName(), batchOp.size());
      }
      batchOp.closeRegionOperation();
    }
    return batchOp.retCodeDetails;
  }



/**
   * Called to do a piece of the batch that came in to {@link #batchMutate(Mutation[], long, long)}
   * In here we also handle replay of edits on region recover. Also gets change in size brought
   * about by applying {@code batchOp}.
   */
private void doMiniBatchMutate(BatchOperation<?> batchOp) throws IOException {
    boolean success = false;
    WALEdit walEdit = null;
    WriteEntry writeEntry = null;
    boolean locked = false;
    // We try to set up a batch in the range [batchOp.nextIndexToProcess,lastIndexExclusive)
    MiniBatchOperationInProgress<Mutation> miniBatchOp = null;
    /** Keep track of the locks we hold so we can release them in finally clause */
    List<RowLock> acquiredRowLocks = Lists.newArrayListWithCapacity(batchOp.size());
    try {
      // STEP 1. Try to acquire as many locks as we can and build mini-batch of operations with
      // locked rows
      // Acquire the row locks (each row lock is itself a read/write lock)
      miniBatchOp = batchOp.lockRowsAndBuildMiniBatch(acquiredRowLocks);

      // We've now grabbed as many mutations off the list as we can
      // Ensure we acquire at least one.
      if (miniBatchOp.getReadyToWriteCount() <= 0) {
        // Nothing to put/delete -- an exception in the above such as NoSuchColumnFamily?
        return;
      }
      // Take the read side of the region-level updatesLock.
      lock(this.updatesLock.readLock(), miniBatchOp.getReadyToWriteCount());
      locked = true;

      // STEP 2. Update mini batch of all operations in progress with  LATEST_TIMESTAMP timestamp
      // We should record the timestamp only after we have acquired the rowLock,
      // otherwise, newer puts/deletes are not guaranteed to have a newer timestamp
      // The timestamp is taken after the locks are held, which keeps it valid
      long now = EnvironmentEdgeManager.currentTime();
      batchOp.prepareMiniBatchOperations(miniBatchOp, now, acquiredRowLocks);

      // STEP 3. Build WAL edit
      List<Pair<NonceKey, WALEdit>> walEdits = batchOp.buildWALEdits(miniBatchOp);

      // STEP 4. Append the WALEdits to WAL and sync.
      for(Iterator<Pair<NonceKey, WALEdit>> it = walEdits.iterator(); it.hasNext();) {
        Pair<NonceKey, WALEdit> nonceKeyWALEditPair = it.next();
        walEdit = nonceKeyWALEditPair.getSecond();
        NonceKey nonceKey = nonceKeyWALEditPair.getFirst();

        if (walEdit != null && !walEdit.isEmpty()) {
          writeEntry = doWALAppend(walEdit, batchOp.durability, batchOp.getClusterIds(), now,
              nonceKey.getNonceGroup(), nonceKey.getNonce(), batchOp.getOrigLogSeqNum());
        }

        // Complete mvcc for all but last writeEntry (for replay case)
        if (it.hasNext() && writeEntry != null) {
          mvcc.complete(writeEntry);
          writeEntry = null;
        }
      }

      // STEP 5. Write back to memStore
      // NOTE: writeEntry can be null here
      // Write to the MemStore
      writeEntry = batchOp.writeMiniBatchOperationsToMemStore(miniBatchOp, writeEntry);

      // STEP 6. Complete MiniBatchOperations: If required calls postBatchMutate() CP hook and
      // complete mvcc for last writeEntry
      // Complete the MiniBatchOperations
      batchOp.completeMiniBatchOperations(miniBatchOp, writeEntry);
      writeEntry = null;
      success = true;
    } finally {
      // Call complete rather than completeAndWait because we probably had error if walKey != null
      if (writeEntry != null) mvcc.complete(writeEntry);

      if (locked) {
        this.updatesLock.readLock().unlock();
      }
      releaseRowLocks(acquiredRowLocks);

      final int finalLastIndexExclusive =
          miniBatchOp != null ? miniBatchOp.getLastIndexExclusive() : batchOp.size();
      final boolean finalSuccess = success;
      batchOp.visitBatchOperations(true, finalLastIndexExclusive, (int i) -> {
        batchOp.retCodeDetails[i] =
            finalSuccess ? OperationStatus.SUCCESS : OperationStatus.FAILURE;
        return true;
      });

      batchOp.doPostOpCleanupForMiniBatch(miniBatchOp, walEdit, finalSuccess);

      batchOp.nextIndexToProcess = finalLastIndexExclusive;
    }
  }

 

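doMiniBatchMutate shows the complete put flow. It breaks down into the following steps:

  1. Acquire the row locks, then the read side of updatesLock
  2. Stamp the mutations with a timestamp taken after the locks are acquired
  3. Build the WALEdit
  4. Append the WALEdits to the WAL and sync
  5. Write to the MemStore
  6. Complete the MiniBatchOperations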
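The ordering of these steps can be condensed into a few lines. The sketch below is a heavy simplification under stated assumptions: `MiniBatchSketch` is a hypothetical class, a plain ReentrantLock per row stands in for HBase's row locks, the WAL is an in-memory list, and both the sync and the MVCC completion are omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

final class MiniBatchSketch {
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();
    final List<String> wal = new ArrayList<>();
    final Map<String, String> memStore = new ConcurrentHashMap<>();

    /** Applies one put and returns the timestamp that was stamped on it. */
    long put(String row, String value) {
        ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
        lock.lock();                                      // step 1: row lock
        try {
            long now = System.currentTimeMillis();        // step 2: timestamp after locking
            String edit = row + "=" + value + "@" + now;  // step 3: build the edit
            synchronized (wal) {
                wal.add(edit);                            // step 4: append to the log (sync omitted)
            }
            memStore.put(row, value);                     // step 5: apply to the MemStore
            return now;                                   // step 6 (MVCC completion) omitted
        } finally {
            lock.unlock();
        }
    }
}
```

Taking the timestamp only after the row lock is held means two writers to the same row are stamped in the same order in which they obtained the lock.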

The focus here is on steps 3 and 4. AsyncFSWAL and FSHLog are the core classes that write the WAL.

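Over the course of HBase's evolution, the HLog write model has been reworked several times, and write throughput has improved greatly. In earlier versions, an HLog write went through three stages: write the data to a local buffer, write the local buffer to the file system, and finally run a sync to persist it to disk.

Clearly the three stages can be pipelined, which naturally leads to a producer-consumer queue. In those earlier versions, however, thread synchronization among producers, among consumers, and between producers and consumers was implemented by HBase itself with a large number of locks. Under very high write concurrency this caused heavy lock contention and poor write performance.

In current versions, HBase uses the LMAX Disruptor framework to implement a lock-free bounded queue. The Disruptor-based HLog write model is shown in the figure.

[Figure: Disruptor-based HLog write model]

The leftmost part of the figure shows the two consecutive operations a Region performs for an HLog write: append and sync. On append, the WALEdit and HLogKey are wrapped in an FSWALEntry, which is in turn wrapped in a RingBufferTruck and placed on the Disruptor lock-free bounded queue. On sync, a SyncFuture is created, wrapped in a RingBufferTruck, and placed on the same queue; the worker thread then blocks until it is woken by notify().

The rightmost part of the figure is the consumer thread. In the Disruptor setup there is exactly one consumer thread. It takes RingBufferTruck objects off the queue in order and handles each one as follows:

·If the RingBufferTruck wraps an FSWALEntry, the consumer performs a file append and adds the record to the HDFS file. Note that at this point the data may not have reached disk yet; it may only be in the file cache.

·If the RingBufferTruck wraps a SyncFuture, a thread from a thread pool flushes to disk asynchronously in batches. Once the flush succeeds, the worker thread is woken up and the HLog sync is complete.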
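The append/sync model described above can be sketched with the JDK alone. Everything here is an assumption made for illustration: an ArrayBlockingQueue stands in for the Disruptor ring buffer (so it is bounded but not lock-free), a CompletableFuture stands in for SyncFuture, the consumer completes the future itself instead of handing it to a separate sync thread, and `WalQueueSketch` and `Truck` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;

final class WalQueueSketch {
    /** One queue slot: carries either an edit to append or a sync request. */
    private static final class Truck {
        final String edit;                       // non-null for an append
        final CompletableFuture<Integer> sync;   // non-null for a sync

        Truck(String edit, CompletableFuture<Integer> sync) {
            this.edit = edit;
            this.sync = sync;
        }
    }

    private final BlockingQueue<Truck> queue = new ArrayBlockingQueue<>(1024);
    private final List<String> file = new ArrayList<>(); // touched only by the consumer
    private final Thread consumer = new Thread(this::consume, "wal-consumer");

    WalQueueSketch() {
        consumer.setDaemon(true);
        consumer.start();
    }

    /** Producer side: enqueue an edit and return immediately. */
    void append(String edit) {
        put(new Truck(edit, null));
    }

    /** Producer side: enqueue a sync marker and block until the consumer reaches it. */
    int sync() {
        CompletableFuture<Integer> future = new CompletableFuture<>();
        put(new Truck(null, future));
        return future.join(); // number of edits written when the sync was handled
    }

    private void put(Truck truck) {
        try {
            queue.put(truck);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** The single consumer: handles trucks strictly in queue order. */
    private void consume() {
        try {
            while (true) {
                Truck truck = queue.take();
                if (truck.edit != null) {
                    file.add(truck.edit);             // append to the "file"
                } else {
                    truck.sync.complete(file.size()); // "flush", then wake the waiter
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because appends and sync markers travel through the same queue and a single thread drains it, a sync issued after an append is always handled after that append, which is the property the real design relies on.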

WAL entries are written through the wal object held by HRegion:

/**
   * @return writeEntry associated with this append
   */
  private WriteEntry doWALAppend(WALEdit walEdit, Durability durability, List<UUID> clusterIds,
      long now, long nonceGroup, long nonce, long origLogSeqNum) throws IOException {
    Preconditions.checkArgument(walEdit != null && !walEdit.isEmpty(),
        "WALEdit is null or empty!");
    Preconditions.checkArgument(!walEdit.isReplay() || origLogSeqNum != SequenceId.NO_SEQUENCE_ID,
        "Invalid replay sequence Id for replay WALEdit!");
    // Using default cluster id, as this can only happen in the originating cluster.
    // A slave cluster receives the final value (not the delta) as a Put. We use HLogKey
    // here instead of WALKeyImpl directly to support legacy coprocessors.
    WALKeyImpl walKey = walEdit.isReplay()?
        new WALKeyImpl(this.getRegionInfo().getEncodedNameAsBytes(),
          this.htableDescriptor.getTableName(), SequenceId.NO_SEQUENCE_ID, now, clusterIds,
            nonceGroup, nonce, mvcc) :
        new WALKeyImpl(this.getRegionInfo().getEncodedNameAsBytes(),
            this.htableDescriptor.getTableName(), SequenceId.NO_SEQUENCE_ID, now, clusterIds,
            nonceGroup, nonce, mvcc, this.getReplicationScope());
    if (walEdit.isReplay()) {
      walKey.setOrigLogSeqNum(origLogSeqNum);
    }
    //don't call the coproc hook for writes to the WAL caused by
    //system lifecycle events like flushes or compactions
    if (this.coprocessorHost != null && !walEdit.isMetaEdit()) {
      this.coprocessorHost.preWALAppend(walKey, walEdit);
    }
    WriteEntry writeEntry = null;
    try {
      long txid = this.wal.appendData(this.getRegionInfo(), walKey, walEdit);
      // Call sync on our edit.
      if (txid != 0) {
        sync(txid, durability);
      }
      writeEntry = walKey.getWriteEntry();
    } catch (IOException ioe) {
      if (walKey != null && walKey.getWriteEntry() != null) {
        mvcc.complete(walKey.getWriteEntry());
      }
      throw ioe;
    }
    return writeEntry;
  }



  @Override
  public long appendData(RegionInfo info, WALKeyImpl key, WALEdit edits) throws IOException {
    return append(info, key, edits, true);
  }

  @Override
  protected long append(RegionInfo hri, WALKeyImpl key, WALEdit edits, boolean inMemstore)
      throws IOException {
      if (markerEditOnly() && !edits.isMetaEdit()) {
        throw new IOException("WAL is closing, only marker edit is allowed");
      }
    long txid = stampSequenceIdAndPublishToRingBuffer(hri, key, edits, inMemstore,
      waitingConsumePayloads);
    if (shouldScheduleConsumer()) {
      consumeExecutor.execute(consumer);
    }
    return txid;
  }

  protected final long stampSequenceIdAndPublishToRingBuffer(RegionInfo hri, WALKeyImpl key,
    WALEdit edits, boolean inMemstore, RingBuffer<RingBufferTruck> ringBuffer)
    throws IOException {
    if (this.closed) {
      throw new IOException(
        "Cannot append; log is closed, regionName = " + hri.getRegionNameAsString());
    }
    MutableLong txidHolder = new MutableLong();
    MultiVersionConcurrencyControl.WriteEntry we = key.getMvcc().begin(() -> {
      txidHolder.setValue(ringBuffer.next()); // claim the next ring buffer sequence
    });
    // txid is the ring buffer sequence claimed above
    long txid = txidHolder.longValue();
    ServerCall<?> rpcCall = RpcServer.getCurrentCall().filter(c -> c instanceof ServerCall)
      .filter(c -> c.getCellScanner() != null).map(c -> (ServerCall) c).orElse(null);
    try (TraceScope scope = TraceUtil.createTrace(implClassName + ".append")) {
      // Create the FSWALEntry instance
      FSWALEntry entry = new FSWALEntry(txid, key, edits, hri, inMemstore, rpcCall);
      entry.stampRegionSequenceId(we);
      ringBuffer.get(txid).load(entry); // load the entry into the claimed slot
    } finally {
      ringBuffer.publish(txid); // publish the slot so the consumer can see it
    }
    return txid;
  }

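The KeyValue described above is what ends up inside the WALEdit: WALEdit holds an ArrayList<Cell> cells, and KeyValue is an implementation of Cell.

Sync source code (as the trace name "AsyncFSWAL.sync" shows, this sync method is from AsyncFSWAL):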

public void sync(long txid, boolean forceSync) throws IOException {
    if (highestSyncedTxid.get() >= txid) {
      return;
    }
    try (TraceScope scope = TraceUtil.createTrace("AsyncFSWAL.sync")) {
      // here we do not use ring buffer sequence as txid
      long sequence = waitingConsumePayloads.next(); // claim a ring buffer sequence

      SyncFuture future;
      try {
        // What sync actually enqueues is a SyncFuture, which acts as a flush marker
        future = getSyncFuture(txid, forceSync);
        RingBufferTruck truck = waitingConsumePayloads.get(sequence);
        truck.load(future);
      } finally {
        waitingConsumePayloads.publish(sequence);
      }
      if (shouldScheduleConsumer()) {
        consumeExecutor.execute(consumer);
      }
      // After publishing to the ring buffer, block until the flush has completed
      blockOnSync(future);
    }
  }

  protected final void blockOnSync(SyncFuture syncFuture) throws IOException {
    // Now we have published the ringbuffer, halt the current thread until we get an answer back.
    try {
      if (syncFuture != null) {
        if (closed) {
          throw new IOException("WAL has been closed");
        } else {
          // This thread is woken once the sync request has been consumed and synced
          // (in FSHLog, the SyncRunner does this via releaseSyncFuture)
          syncFuture.get(walSyncTimeoutNs);
        }
      }
    } catch (TimeoutIOException tioe) {
      // SyncFuture reuse by thread, if TimeoutIOException happens, ringbuffer
      // still refer to it, so if this thread use it next time may get a wrong
      // result.
      this.cachedSyncFutures.remove();
      throw tioe;
    } catch (InterruptedException ie) {
      LOG.warn("Interrupted", ie);
      throw convertInterruptedExceptionToIOException(ie);
    } catch (ExecutionException e) {
      throw ensureIOException(e.getCause());
    }
  }
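The pattern in blockOnSync, a bounded wait whose failure modes are all surfaced as IOException, can be shown in isolation. This is a sketch only: `BlockOnSyncSketch.await` is a hypothetical helper, a CompletableFuture stands in for SyncFuture, and a plain IOException replaces HBase's TimeoutIOException.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BlockOnSyncSketch {
    /** Waits for the sync result, mapping every failure mode to IOException. */
    static long await(CompletableFuture<Long> syncFuture, long timeoutMs) throws IOException {
        try {
            return syncFuture.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new IOException("sync timed out after " + timeoutMs + " ms", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IOException("interrupted while waiting for sync", e);
        } catch (ExecutionException e) {
            throw new IOException("sync failed", e.getCause());
        }
    }
}
```

The timeout matters because the caller is an RPC handler thread: without it, a stuck sync would pin the handler indefinitely.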

Core code of FSHLog.RingBufferEventHandler#onEvent:

@Override
    // We can set endOfBatch in the below method if at end of our this.syncFutures array
    public void onEvent(final RingBufferTruck truck, final long sequence, boolean endOfBatch)
        throws Exception {
    

      try {
        if (truck.type() == RingBufferTruck.Type.SYNC) {
          this.syncFutures[this.syncFuturesCount.getAndIncrement()] = truck.unloadSync();
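          // Collect a batch of SyncFutures for efficiency. The batch should not be too large, though:
          // the larger it is, the longer RegionServer RPC handler threads stay blocked in SyncFuture.get()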

          if (this.syncFuturesCount.get() == this.syncFutures.length) {
            endOfBatch = true;
          }
        } else if (truck.type() == RingBufferTruck.Type.APPEND) {
          FSWALEntry entry = truck.unloadAppend();
          try {
            if (this.exception != null) {
              // Return to keep processing events coming off the ringbuffer
              return;
            }
            append(entry);
          } catch (Exception e) {
            // Failed append. Record the exception.
            this.exception = e;
            cleanupOutstandingSyncsOnException(sequence,
                this.exception instanceof DamagedWALException ? this.exception
                    : new DamagedWALException("On sync", this.exception));
            // Return to keep processing events coming off the ringbuffer
            return;
          } finally {
            entry.release();
          }
        } else {
          // What is this if not an append or sync. Fail all up to this!!!
          cleanupOutstandingSyncsOnException(sequence,
            new IllegalStateException("Neither append nor sync"));
          // Return to keep processing.
          return;
        }
        if (this.exception == null) {
          if (!endOfBatch || this.syncFuturesCount.get() <= 0) {
            return;
          }
          // Pick the next SyncRunner in round-robin order
          this.syncRunnerIndex = (this.syncRunnerIndex + 1) % this.syncRunners.length;
          try {
            // Add the pending sync work to that SyncRunner's blocking queue
            this.syncRunners[this.syncRunnerIndex].offer(sequence, this.syncFutures,
              this.syncFuturesCount.get());
          } catch (Exception e) {
            // Should NEVER get here.
            requestLogRoll(ERROR);
            this.exception = new DamagedWALException("Failed offering sync", e);
          }
        }
        // We may have picked up an exception above trying to offer sync
        if (this.exception != null) {
          cleanupOutstandingSyncsOnException(sequence, this.exception instanceof DamagedWALException
              ? this.exception : new DamagedWALException("On sync", this.exception));
        }
        attainSafePoint(sequence);
        this.syncFuturesCount.set(0);
      } catch (Throwable t) {
        LOG.error("UNEXPECTED!!! syncFutures.length=" + this.syncFutures.length, t);
      }
}

 

Core code of FSHLog.SyncRunner#run:

@Override
    public void run() {
      long currentSequence;
      while (!isInterrupted()) {
        int syncCount = 0;

        try {
          SyncFuture sf;
          while (true) {
            takeSyncFuture = null;
            // Take the next task from the syncFutures queue
            takeSyncFuture = this.syncFutures.take();
            // Make local copy.
            sf = takeSyncFuture;
            currentSequence = this.sequence;
            long syncFutureSequence = sf.getTxid();
            if (syncFutureSequence > currentSequence) {
              throw new IllegalStateException("currentSequence=" + currentSequence
                  + ", syncFutureSequence=" + syncFutureSequence);
            }
            // The WAL consumer thread submits several SyncFutures at a time. The SyncRunner only
            // carries out the sync for the newest one (the one with the largest sequence id)
            // and skips the earlier ones.
            long currentHighestSyncedSequence = highestSyncedTxid.get();
            if (currentSequence < currentHighestSyncedSequence) {
              syncCount += releaseSyncFuture(sf, currentHighestSyncedSequence, null);
              // Done with the 'take'. Go around again and do a new 'take'.
              continue;
            }
            break;
          }
          long start = System.nanoTime();
          Throwable lastException = null;
          try {
            TraceUtil.addTimelineAnnotation("syncing writer");
            long unSyncedFlushSeq = highestUnsyncedTxid;
            // Implemented in ProtobufLogWriter.sync, which flushes by calling hsync or hflush
            // on the FSDataOutputStream
            writer.sync(sf.isForceSync());
            TraceUtil.addTimelineAnnotation("writer synced");
            if (unSyncedFlushSeq > currentSequence) {
              currentSequence = unSyncedFlushSeq;
            }
            currentSequence = updateHighestSyncedSequence(currentSequence);
          } catch (IOException e) {
            LOG.error("Error syncing, request close of WAL", e);
            lastException = e;
          } catch (Exception e) {
            LOG.warn("UNEXPECTED", e);
            lastException = e;
          } finally {
            // Wake up the waiting threads
            syncCount += releaseSyncFuture(takeSyncFuture, currentSequence, lastException);
            // Can we release other syncs?
            syncCount += releaseSyncFutures(currentSequence, lastException);
            if (lastException != null) {
              requestLogRoll(ERROR);
            } else {
              checkLogRoll();
            }
          }
          postSync(System.nanoTime() - start, syncCount);
        } catch (InterruptedException e) {
          // Presume legit interrupt.
          Thread.currentThread().interrupt();
        } catch (Throwable t) {
          LOG.warn("UNEXPECTED, continuing", t);
        }
      }
    }
  }
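The batching idea behind SyncRunner, where one physical flush releases every waiter whose txid it covers, can be sketched as follows. The sketch is single-threaded and uses hypothetical names (`GroupSyncSketch`, `requestSync`, `syncUpTo`); a counter stands in for the real writer.sync call, and CompletableFuture stands in for SyncFuture.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class GroupSyncSketch {
    private static final class Pending {
        final long txid;
        final CompletableFuture<Long> future = new CompletableFuture<>();

        Pending(long txid) {
            this.txid = txid;
        }
    }

    private final List<Pending> pending = new ArrayList<>();
    private long highestSyncedTxid = 0;
    int physicalSyncs = 0; // how many times the simulated flush actually ran

    /** Registers a waiter for txid; completes at once if that txid is already synced. */
    CompletableFuture<Long> requestSync(long txid) {
        Pending p = new Pending(txid);
        if (txid <= highestSyncedTxid) {
            p.future.complete(highestSyncedTxid);
        } else {
            pending.add(p);
        }
        return p.future;
    }

    /** Runs one flush covering everything appended up to highestAppendedTxid. */
    int syncUpTo(long highestAppendedTxid) {
        physicalSyncs++;                        // stand-in for writer.sync(...)
        highestSyncedTxid = Math.max(highestSyncedTxid, highestAppendedTxid);
        int released = 0;
        for (Iterator<Pending> it = pending.iterator(); it.hasNext();) {
            Pending p = it.next();
            if (p.txid <= highestSyncedTxid) {  // covered by this flush
                p.future.complete(highestSyncedTxid);
                it.remove();
                released++;
            }
        }
        return released;
    }
}
```

The number of physical flushes therefore grows with the number of batches rather than with the number of sync requests, which is where the throughput gain comes from.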