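Introduction
The DataTransferThrottler class controls the data transfer rate when a DataNode reads and writes data. The class is thread-safe and can be shared by multiple threads.
To use it, construct a DataTransferThrottler object with a period and a bandwidth (bandwidthPerSec), then call DataTransferThrottler.throttle() in the read/write loop with the number of bytes transferred since the previous call. If the I/O rate is too fast relative to the given bandwidth, the method makes the current thread wait.
Constructors
The two-argument constructor sets both the period and the bandwidth (bandwidthPerSec).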
/**
* Constructor
* @param period in milliseconds. Bandwidth is enforced over this
* period.
* @param bandwidthPerSec bandwidth allowed in bytes per second.
*/
public DataTransferThrottler(long period, long bandwidthPerSec) {
this.curPeriodStart = monotonicNow();
this.period = period;
this.curReserve = this.bytesPerPeriod = bandwidthPerSec*period/1000;
this.periodExtension = period*3;
}
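To make the constructor arithmetic concrete, here is a minimal standalone sketch that reproduces only the two computed fields. The class name ThrottlerBudgetSketch and the 10 MiB/s limit are hypothetical and chosen for illustration; nothing here touches Hadoop itself.

```java
// Standalone sketch: reproduces only the arithmetic of the two-argument constructor.
final class ThrottlerBudgetSketch {
    /** Bytes allowed per period, computed the same way as in the constructor. */
    static long bytesPerPeriod(long periodMs, long bandwidthPerSec) {
        return bandwidthPerSec * periodMs / 1000;
    }

    /** Window (in ms) after which throttle() discards the previous period. */
    static long periodExtension(long periodMs) {
        return periodMs * 3;
    }

    public static void main(String[] args) {
        long bandwidthPerSec = 10L * 1024 * 1024; // hypothetical limit: 10 MiB/s
        long periodMs = 500;
        System.out.println(bytesPerPeriod(periodMs, bandwidthPerSec)); // 5242880
        System.out.println(periodExtension(periodMs));                 // 1500
    }
}
```

With a 500 ms period, the per-period budget is half of the per-second bandwidth, so two periods add up to the configured rate.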
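The single-argument constructor sets only the bandwidth (bandwidthPerSec); the period defaults to 500 ms (its source appears in the issue notes at the end). The getBandwidth() accessor below converts the stored per-period budget back into bytes per second.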
/**
* @return current throttle bandwidth in bytes per second.
*/
public synchronized long getBandwidth() {
return bytesPerPeriod*1000/period;
}
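Because both the constructor and getBandwidth() use integer division, the value read back can differ slightly from the value configured. The sketch below round-trips a bandwidth through the two formulas; the class name BandwidthRoundTripSketch and the sample values are assumptions for illustration.

```java
// Standalone sketch: constructor formula followed by the getBandwidth() formula.
final class BandwidthRoundTripSketch {
    static long roundTrip(long periodMs, long bandwidthPerSec) {
        long bytesPerPeriod = bandwidthPerSec * periodMs / 1000; // as in the constructor
        return bytesPerPeriod * 1000 / periodMs;                 // as in getBandwidth()
    }

    public static void main(String[] args) {
        // 1 MiB/s survives the round trip unchanged.
        System.out.println(roundTrip(500, 1024 * 1024)); // 1048576
        // 1001 B/s is truncated to a 500-byte budget, which reads back as 1000 B/s.
        System.out.println(roundTrip(500, 1001));        // 1000
    }
}
```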
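Key fields
period the period, in milliseconds
periodExtension the period extension window, in milliseconds
bytesPerPeriod total number of bytes that may be sent/received in one period
curPeriodStart start time of the current period, in milliseconds
curReserve number of bytes that may still be sent/received in the current period
bytesAlreadyUsed number of bytes already used in the current period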
DataTransferThrottler.throttle()
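DataTransferThrottler.throttle() deducts the requested byte count from the remaining budget and checks what is left. If budget remains, it returns immediately and the calling thread can carry on sending/receiving data. Otherwise it loops, waiting for the current period to end and adding bytesPerPeriod to the remaining budget at each period boundary, and returns only once the budget is positive again.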
/** Given the numOfBytes sent/received since last time throttle was called,
* make the current thread sleep if I/O rate is too fast
* compared to the given bandwidth. Allows for optional external cancelation.
*
* @param numOfBytes
* number of bytes sent/received since last time throttle was called
* @param canceler
* optional canceler to check for abort of throttle
*/
public synchronized long throttle(long numOfBytes, Canceler canceler) {
if ( numOfBytes <= 0 ) {
return 0;
}
long currentWaitTime = 0;
// Deduct the bytes being sent/received (numOfBytes) from the current period's reserve
curReserve -= numOfBytes;
bytesAlreadyUsed += numOfBytes;
// curReserve <= 0 means the current period has no budget left, so throttling starts
while (curReserve <= 0) {
// If a canceler was supplied and it has been cancelled, return immediately
if (canceler != null && canceler.isCancelled()) {
return currentWaitTime;
}
long now = monotonicNow();
// Compute when the current period ends and store it in curPeriodEnd
long curPeriodEnd = curPeriodStart + period;
if ( now < curPeriodEnd ) {
// Wait for next period so that curReserve can be increased.
totalWaitCount.incrementAndGet();
long start = Time.monotonicNow();
try {
wait( curPeriodEnd - now );
} catch (InterruptedException e) {
// Abort throttle and reset interrupted status to make sure other
// interrupt handling higher in the call stack executes.
Thread.currentThread().interrupt();
break;
} finally {
long wait = Time.monotonicNow() - start;
currentWaitTime += wait;
totalWaitTime.addAndGet(wait);
}
// now is past curPeriodEnd but before curPeriodStart + periodExtension (three periods):
// advance to the next period and add bytesPerPeriod to curReserve
} else if ( now < (curPeriodStart + periodExtension)) {
curPeriodStart = curPeriodEnd;
// Replenish the reserve for the new period
curReserve += bytesPerPeriod;
// now is at or past curPeriodStart + periodExtension: the throttler has probably been idle for a long time, so discard the previous period
} else {
// discard the prev period. Throttler might not have
// been used for a long time.
curPeriodStart = now;
curReserve = bytesPerPeriod - bytesAlreadyUsed;
}
}
bytesAlreadyUsed -= numOfBytes;
return currentWaitTime;
}
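The three branches of the loop are easier to follow with concrete numbers. The following is a simplified sketch, not the Hadoop implementation: it replays the same bookkeeping against a caller-supplied clock and never blocks, so "waiting" only advances simulated time. The class PeriodAccountingSketch, its methods, and the sample values are hypothetical, and it assumes a single caller, so bytesAlreadyUsed is simply the current request.

```java
// Standalone sketch: period bookkeeping of throttle() with a simulated clock.
final class PeriodAccountingSketch {
    private final long period;
    private final long periodExtension;
    private final long bytesPerPeriod;
    private long curPeriodStart;
    private long curReserve;

    PeriodAccountingSketch(long periodMs, long bandwidthPerSec, long startMs) {
        this.period = periodMs;
        this.periodExtension = periodMs * 3;
        this.bytesPerPeriod = bandwidthPerSec * periodMs / 1000;
        if (bytesPerPeriod <= 0) {
            throw new IllegalArgumentException("budget per period must be positive");
        }
        this.curPeriodStart = startMs;
        this.curReserve = bytesPerPeriod;
    }

    /** Charges numOfBytes at simulated time nowMs; returns the simulated wait in ms. */
    long charge(long numOfBytes, long nowMs) {
        if (numOfBytes <= 0) {
            return 0;
        }
        long waited = 0;
        long now = nowMs;
        curReserve -= numOfBytes;
        while (curReserve <= 0) {
            long curPeriodEnd = curPeriodStart + period;
            if (now < curPeriodEnd) {
                // Stands in for wait(curPeriodEnd - now): jump to the period boundary.
                waited += curPeriodEnd - now;
                now = curPeriodEnd;
            } else if (now < curPeriodStart + periodExtension) {
                // Roll over to the next period and replenish the reserve.
                curPeriodStart = curPeriodEnd;
                curReserve += bytesPerPeriod;
            } else {
                // Idle for too long: restart the period at now (single caller assumed).
                curPeriodStart = now;
                curReserve = bytesPerPeriod - numOfBytes;
            }
        }
        return waited;
    }

    long reserve() {
        return curReserve;
    }

    public static void main(String[] args) {
        // 1000 B/s with a 500 ms period gives a 500-byte budget per period.
        PeriodAccountingSketch s = new PeriodAccountingSketch(500, 1000, 0);
        System.out.println(s.charge(400, 0));    // 0: 100 bytes of budget remain
        System.out.println(s.charge(400, 100));  // 400: waits until t=500, then rolls over
        System.out.println(s.charge(300, 5000)); // 0: idle past the extension, period restarts
    }
}
```

The second call shows the normal throttling path (wait, then roll over); the third shows the discard path, where unused budget from the idle stretch is not carried forward.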
Usage example
(A record of the approach, not the project source; parts are omitted and simplified.)
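// Bandwidth limit in MB/s. The null is a placeholder: a real value must be set here,
// because Long.valueOf(null) throws NumberFormatException.
String limit=null;
// Configure the traffic limit (bandwidth) here
DataTransferThrottler throttler = new DataTransferThrottler(Long.valueOf(limit)*1024*1024);
// Source file system
FSDataInputStream inputStream = srcHdfsFileSystem.open(srcFilePath,4096);
// Target file system
FSDataOutputStream outputStream = targetHdfsFileSystem.create(targetFilePath,true);
// Copy the file through I/O streams
// Buffered byte-stream reads
BufferedInputStream bufferedInputStream = new BufferedInputStream(inputStream);
// Buffered byte-stream writes
BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(outputStream);
int len=0;
// Buffer
byte[] buf = new byte[2048 * 2];
// Read in a loop
while ((len=bufferedInputStream.read(buf)) != -1){
bufferedOutputStream.write(buf,0,len);
// The throttler accounts for the bytes just written and waits if the rate is too high
throttler.throttle(len);
}
// Close the streams...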
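The stream closing elided above could be handled with try-with-resources. The sketch below shows the same copy loop under that assumption, using plain java.io streams and a hypothetical LongConsumer hook in place of throttler.throttle(len) so that it runs without Hadoop on the classpath. The class and method names are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.function.LongConsumer;

// Standalone sketch: copy loop with a throttle hook and automatic stream closing.
final class ThrottledCopySketch {
    /** Copies in to out, reporting each chunk size to the hook; returns bytes copied. */
    static long copy(InputStream in, OutputStream out, LongConsumer throttle) throws IOException {
        long total = 0;
        byte[] buf = new byte[4096];
        // Both streams are closed when the block exits, even if an exception is thrown.
        try (InputStream src = in; OutputStream dst = out) {
            int len;
            while ((len = src.read(buf)) != -1) {
                dst.write(buf, 0, len);
                throttle.accept(len); // the real throttler would be charged here
                total += len;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long[] charged = {0};
        long copied = copy(new ByteArrayInputStream(data), sink, n -> charged[0] += n);
        System.out.println(copied + " " + charged[0] + " " + sink.size()); // 10000 10000 10000
    }
}
```

With Hadoop available, the hook would presumably be a lambda that forwards to throttler.throttle(len), and the streams would be the FSDataInputStream and FSDataOutputStream from the example above.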
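Issue notes
Reading the official source, it first looks as though the bandwidth configured in DataTransferThrottler is enforced per 500 ms, which would make the per-second rate twice the configured value. Testing showed otherwise: the configured value is a per-second rate (verified both through getBandwidth() and by dividing file size by elapsed time), and assuming a factor of two gives wrong speed figures. This matches the two-argument constructor, which scales the budget to bandwidthPerSec*period/1000 bytes per period; the 500 ms default only sets how often the budget is replenished. The single-argument constructor, for reference: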
/** Constructor
* @param bandwidthPerSec bandwidth allowed in bytes per second.
*/
public DataTransferThrottler(long bandwidthPerSec) {
this(500, bandwidthPerSec); // by default throttling period is 500ms
}
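The file-size-over-time check mentioned in the notes can be sketched as two small helpers. The names and the sample numbers below are hypothetical and are not measurements from the original test; the expected duration is only approximate, since the throttler starts with one full period of budget.

```java
// Standalone sketch: arithmetic for checking a throttled copy against its limit.
final class ThroughputCheckSketch {
    /** Average rate in bytes per second for a finished copy. */
    static double measuredBytesPerSec(long fileSizeBytes, long elapsedMs) {
        if (elapsedMs <= 0) {
            throw new IllegalArgumentException("elapsedMs must be positive");
        }
        return fileSizeBytes * 1000.0 / elapsedMs;
    }

    /** Approximate duration in ms if the limit is a per-second rate. */
    static long expectedMillis(long fileSizeBytes, long bandwidthPerSec) {
        return fileSizeBytes * 1000 / bandwidthPerSec;
    }

    public static void main(String[] args) {
        long fileSize = 100L * 1024 * 1024; // hypothetical 100 MiB file
        long limit = 10L * 1024 * 1024;     // hypothetical 10 MiB/s limit
        System.out.println(expectedMillis(fileSize, limit));       // 10000
        // A copy that took about 10 s averages the configured per-second rate.
        System.out.println(measuredBytesPerSec(fileSize, 10_000)); // 1.048576E7
    }
}
```

If the limit were really applied per 500 ms, the same copy would finish in roughly half the expected time, which is an easy difference to spot in a test run.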