1. Shortlisted options

Sentinel

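Built by Alibaba; the rate-limiting component of Spring Cloud Alibaba; actively maintained.

Ships with a dashboard that shows per-endpoint QPS and other metrics, and lets you change rules at runtime.

Flow control: direct rejection, warm-up (cold start), and queueing.

Circuit breaking and degradation: limits on concurrency and response time.

System load protection: system-level safeguards, such as a cap on overall CPU usage.

Core concepts: resources, rules (flow control, circuit breaking and degradation, system protection, origin access control, and hot-spot parameter rules), and metrics.

The documentation is clear and detailed, and available in Chinese.

Supports dynamic rules (push mode and pull mode).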

Hystrix

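Built by Netflix; the rate-limiting component of Spring Cloud Netflix. New feature development has stopped and only bug fixes are made; the most recent update dates from 2018. Functionally stable.

Has a simple dashboard page.

A first-generation circuit-breaker framework whose fault tolerance centers on isolation and circuit breaking: calls that time out or hit an open circuit fail fast, and a fallback can be supplied. Failure statistics are kept over a sliding window.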
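To illustrate what sliding-window failure statistics mean, here is a minimal plain-Java sketch. It is not Hystrix's implementation: the RollingCounter class, its methods, and the injectable clock are hypothetical names chosen for this example. The window is split into fixed-size buckets, and a bucket's slot is reused once it falls out of the window.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;

/** Illustrative rolling-window failure counter; not Hystrix's actual code. */
final class RollingCounter {
    private final long windowMillis;
    private final long bucketMillis;
    private final long[] bucketStart;        // start time of the bucket held in each slot
    private final int[] successes;
    private final int[] failures;
    private final LongSupplier clockMillis;  // injectable clock (assumption, eases testing)

    RollingCounter(long windowMillis, int buckets, LongSupplier clockMillis) {
        this.windowMillis = windowMillis;
        this.bucketMillis = windowMillis / buckets;
        this.bucketStart = new long[buckets];
        Arrays.fill(bucketStart, Long.MIN_VALUE); // marks every slot as empty
        this.successes = new int[buckets];
        this.failures = new int[buckets];
        this.clockMillis = clockMillis;
    }

    /** Records one call outcome in the bucket that covers the current time. */
    synchronized void record(boolean success) {
        long now = clockMillis.getAsLong();
        long start = now - (now % bucketMillis);
        int slot = (int) ((now / bucketMillis) % bucketStart.length);
        if (bucketStart[slot] != start) {     // slot holds an expired bucket: reuse it
            bucketStart[slot] = start;
            successes[slot] = 0;
            failures[slot] = 0;
        }
        if (success) {
            successes[slot]++;
        } else {
            failures[slot]++;
        }
    }

    /** Failure percentage over the buckets that started within the window. */
    synchronized int errorPercentage() {
        long oldest = clockMillis.getAsLong() - windowMillis;
        int ok = 0;
        int bad = 0;
        for (int i = 0; i < bucketStart.length; i++) {
            if (bucketStart[i] > oldest) {
                ok += successes[i];
                bad += failures[i];
            }
        }
        int total = ok + bad;
        return total == 0 ? 0 : bad * 100 / total;
    }

    public static void main(String[] args) {
        RollingCounter counter = new RollingCounter(10_000, 10, System::currentTimeMillis);
        counter.record(false);
        counter.record(true);
        System.out.println("error percentage: " + counter.errorPercentage());
    }
}
```

Old outcomes stop counting as soon as their bucket leaves the window, so a burst of failures only influences the breaker for one window length.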

resilience4j

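A lightweight, simple circuit-breaking library with clear and extensive documentation. It is a replacement for Hystrix and follows the same design ideas; actively maintained.

Monitoring requires integrating Micrometer, Prometheus, or Dropwizard Metrics yourself.

CircuitBreaker: circuit breaking

Bulkhead: isolation

RateLimiter: QPS limiting

Retry: retries

TimeLimiter: timeout limiting

Cache: caching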
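As a rough illustration of the Bulkhead idea from the list above (capping the number of concurrent calls), here is a plain-Java sketch built on java.util.concurrent.Semaphore. It is not resilience4j's implementation; the SemaphoreBulkhead class and its execute method are hypothetical names used only for this example.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Illustrative semaphore bulkhead; not resilience4j's actual code. */
final class SemaphoreBulkhead {
    private final Semaphore permits;
    private final long maxWaitMillis;

    SemaphoreBulkhead(int maxConcurrentCalls, long maxWaitMillis) {
        this.permits = new Semaphore(maxConcurrentCalls);
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Runs the call if a permit is free within maxWaitMillis, otherwise the fallback. */
    <T> T execute(Supplier<T> call, Supplier<T> fallback) {
        boolean acquired;
        try {
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return fallback.get();
        }
        if (!acquired) {
            return fallback.get();              // bulkhead is full
        }
        try {
            return call.get();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        SemaphoreBulkhead bulkhead = new SemaphoreBulkhead(1, 0);
        // The inner call runs while the outer call still holds the only permit.
        String result = bulkhead.execute(
                () -> "outer:" + bulkhead.execute(() -> "inner ran", () -> "inner rejected"),
                () -> "outer rejected");
        System.out.println(result); // prints "outer:inner rejected"
    }
}
```

A semaphore isolates callers without extra threads; a thread-pool bulkhead trades that simplicity for the ability to time out and queue calls.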

Custom implementation (based on Guava)

Guava's token bucket makes it easy to rate-limit by QPS.
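The token-bucket idea can be sketched in plain Java as follows. This is not Guava's code: with Guava you would typically call RateLimiter.create(permitsPerSecond) and tryAcquire() instead. The TokenBucket class and its injectable nanosecond clock are hypothetical and exist only to show the mechanism.

```java
import java.util.function.LongSupplier;

/** Illustrative token bucket: refills at a fixed rate, rejects calls when empty. */
final class TokenBucket {
    private final double capacity;       // maximum burst size, in tokens
    private final double tokensPerNano;  // refill rate
    private final LongSupplier clock;    // nanosecond clock (assumption, eases testing)
    private double tokens;
    private long lastRefill;

    TokenBucket(double permitsPerSecond, double capacity, LongSupplier clock) {
        this.capacity = capacity;
        this.tokensPerNano = permitsPerSecond / 1_000_000_000.0;
        this.clock = clock;
        this.tokens = capacity;          // start with a full bucket
        this.lastRefill = clock.getAsLong();
    }

    /** Takes one token if one is available; never blocks. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * tokensPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(5, 5, System::nanoTime);
        int allowed = 0;
        for (int i = 0; i < 20; i++) {
            if (bucket.tryAcquire()) {
                allowed++;
            }
        }
        System.out.println("allowed " + allowed + " of 20 back-to-back calls");
    }
}
```

The capacity bounds the burst a caller can send at once, while the refill rate bounds the sustained QPS.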

2. Technical comparison

[Image: comparison of the circuit-breaking frameworks]

3. Application changes

3.1 Sentinel

3.1.1 Add the dependency

<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-sentinel</artifactId>
    <version>2.0.3.RELEASE</version>
</dependency>

3.1.2 Modify the controller or service layer

@SentinelResource(value = "allInfos", fallback = "errorReturn")

@Target({ElementType.METHOD, ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@Inherited
public @interface SentinelResource {
    // Resource name
    String value() default "";
    // Traffic direction (inbound or outbound)
    EntryType entryType() default EntryType.OUT;
    // Resource type
    int resourceType() default 0;
    // Name of the method that handles BlockException
    String blockHandler() default "";
    // Class that holds the blockHandler method
    Class<?>[] blockHandlerClass() default {};
    // Name of the fallback method
    String fallback() default "";
    // Name of the default fallback method
    String defaultFallback() default "";
    // Class that holds the fallback method
    Class<?>[] fallbackClass() default {};
    // Exceptions to trace (count)
    Class<? extends Throwable>[] exceptionsToTrace() default {Throwable.class};
    // Exceptions to ignore
    Class<? extends Throwable>[] exceptionsToIgnore() default {};
}

@RequestMapping("/get")
@ResponseBody
@SentinelResource(value = "allInfos", fallback = "errorReturn")
public JsonResult allInfos(HttpServletRequest request, HttpServletResponse response, @RequestParam Integer num) {
    try {
        if (num % 2 == 0) {
            log.info("num % 2 == 0");
            throw new BaseException("something bad with 2", 400);
        }
        return JsonResult.ok();
    } catch (ProgramException e) {
        log.info("error");
        return JsonResult.error("error");
    }
}

3.1.3 Configure fallback or rate-limit handlers for the endpoint

By default, the filter intercepts all controller endpoints.

/**
 * Rate-limit handler. The parameter list must match the original method.
 * @param request
 * @param response
 * @param num
 * @return
 * @throws BlockException
 */
public JsonResult errorReturn(HttpServletRequest request, HttpServletResponse response, @RequestParam Integer num) throws BlockException {
    return JsonResult.error("error: rate limited " + num);
}

/**
 * Circuit-breaking handler. The parameter list must match the original method,
 * with a BlockException appended as the last parameter. This is the signature
 * that the blockHandler attribute of @SentinelResource expects.
 * @param request
 * @param response
 * @param num
 * @param b
 * @return
 * @throws BlockException
 */
public JsonResult errorReturn(HttpServletRequest request, HttpServletResponse response, @RequestParam Integer num, BlockException b) throws BlockException {
    return JsonResult.error("error: circuit open " + num);
}

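Note: you can also skip the per-method rate-limit and fallback handlers and instead catch UndeclaredThrowableException or BlockException in a global exception handler, which avoids a lot of repetitive work.

3.1.4 Connect to the dashboard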

spring:
  cloud:
    sentinel:
      transport:
        port: 8719
        dashboard: localhost:8080

[Image: Sentinel dashboard]

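3.1.5 Rule persistence and dynamic updates

Connect a configuration center such as ZooKeeper and distribute the rules in push mode.

3.2 Hystrix

3.2.1 Add the dependencies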

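<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-hystrix-dashboard</artifactId>
    <version>2.0.4.RELEASE</version>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
    <version>2.0.4.RELEASE</version>
</dependency>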

3.2.2 Modify the endpoint

@HystrixCommand(fallbackMethod = "fallbackMethod")

@Target({ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
@Inherited
@Documented
public @interface HystrixCommand {
    String groupKey() default "";
    String commandKey() default "";
    String threadPoolKey() default "";
    String fallbackMethod() default "";
    HystrixProperty[] commandProperties() default {};
    HystrixProperty[] threadPoolProperties() default {};
    Class<? extends Throwable>[] ignoreExceptions() default {};
    ObservableExecutionMode observableExecutionMode() default ObservableExecutionMode.EAGER;
    HystrixException[] raiseHystrixExceptions() default {};
    String defaultFallback() default "";
}

@RequestMapping("/get")
@ResponseBody
@HystrixCommand(fallbackMethod = "fallbackMethod")
public JsonResult allInfos(HttpServletRequest request, HttpServletResponse response, @RequestParam Integer num) {
    try {
        if (num % 3 == 0) {
            log.info("num % 3 == 0");
            throw new BaseException("something bad with 3", 400);
        }
        return JsonResult.ok();
    } catch (ProgramException exception) {
        // InterruptedException was removed from this catch: nothing in the try block throws it
        log.info("error");
        return JsonResult.error("error");
    }
}
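
3.2.3 Configure the fallback method for the endpoint

/**
 * Fallback invoked when the circuit is open or the call fails.
 * The parameter list must match the endpoint method.
 * @param request
 * @param response
 * @param num
 * @return
 */
public JsonResult fallbackMethod(HttpServletRequest request, HttpServletResponse response, @RequestParam Integer num) {
    response.setStatus(500);
    log.info("Circuit breaker triggered");
    return JsonResult.error("circuit open");
}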

3.2.4 Configure the default policy

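hystrix:
  command:
    default:
      execution:
        isolation:
          strategy: THREAD
          thread:
            # A call that runs longer than 15 seconds times out and the fallback is invoked
            timeoutInMilliseconds: 15000
      metrics:
        rollingStats:
          # Length of the rolling statistics window: 15 seconds
          timeInMilliseconds: 15000
      circuitBreaker:
        # Once at least 3 requests arrive within the rolling window and the error rate
        # reaches 50% (the default threshold), the circuit opens and the fallback is
        # invoked; after the 10-second sleep window a trial request is allowed through
        requestVolumeThreshold: 3
        sleepWindowInMilliseconds: 10000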
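The interaction of requestVolumeThreshold, the 50% error threshold, and sleepWindowInMilliseconds can be sketched as a small state machine. This is a simplified plain-Java illustration, not Hystrix's implementation: the SimpleCircuitBreaker class and its injectable clock are hypothetical, it counts calls since the breaker last closed instead of using a rolling window, and it serializes calls for brevity.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Illustrative circuit breaker; a simplification, not Hystrix's actual code. */
final class SimpleCircuitBreaker {
    private final int requestVolumeThreshold;
    private final int errorThresholdPercentage;
    private final long sleepWindowMillis;
    private final LongSupplier clockMillis;  // injectable clock (assumption, eases testing)
    private int total;                       // calls counted since the breaker last closed
    private int failed;
    private boolean open;
    private long openedAt;

    SimpleCircuitBreaker(int requestVolumeThreshold, int errorThresholdPercentage,
                         long sleepWindowMillis, LongSupplier clockMillis) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clockMillis = clockMillis;
    }

    synchronized boolean isOpen() {
        return open;
    }

    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        long now = clockMillis.getAsLong();
        if (open && now - openedAt < sleepWindowMillis) {
            return fallback.get();           // fail fast while the breaker is open
        }
        boolean trial = open;                // sleep window elapsed: allow one trial call
        try {
            T result = action.get();
            if (trial) {                     // trial succeeded: close and reset the counters
                open = false;
                total = 0;
                failed = 0;
            } else {
                total++;
            }
            return result;
        } catch (RuntimeException e) {
            if (trial) {
                openedAt = now;              // trial failed: stay open for another sleep window
            } else {
                total++;
                failed++;
                if (total >= requestVolumeThreshold
                        && failed * 100 >= errorThresholdPercentage * total) {
                    open = true;
                    openedAt = now;
                }
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker =
                new SimpleCircuitBreaker(3, 50, 10_000, System::currentTimeMillis);
        for (int i = 0; i < 3; i++) {
            breaker.<String>call(() -> { throw new IllegalStateException("boom"); }, () -> "fallback");
        }
        System.out.println("open after 3 failures: " + breaker.isOpen());
    }
}
```

With the values from the configuration above (3, 50, 10000), three consecutive failures open the breaker, and the next real call is attempted only after ten seconds.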

3.2.5 Connect monitoring

[Images: Hystrix Dashboard]

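Curve: records the relative change in traffic over the last 2 minutes, so you can see whether traffic is rising or falling.

Cluster monitoring requires a service registry.

3.3 resilience4j

3.3.1 Add the dependencies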

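<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-test</artifactId>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-spring-boot2</artifactId>
    <version>1.6.1</version>
</dependency>
<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-bulkhead</artifactId>
    <version>1.6.1</version>
</dependency>
<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-ratelimiter</artifactId>
    <version>1.6.1</version>
</dependency>
<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-timelimiter</artifactId>
    <version>1.6.1</version>
</dependency>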

Add modules as needed: bulkhead, ratelimiter, timelimiter, and so on.

3.3.2 Modify the endpoint

@RequestMapping("/get")
@ResponseBody
// The instance name must match an instance defined in the configuration (backendA in section 3.3.4)
//@TimeLimiter(name = "backendA", fallbackMethod = "fallbackMethod")
@CircuitBreaker(name = "backendA", fallbackMethod = "fallbackMethod")
@Bulkhead(name = "backendA", fallbackMethod = "fallbackMethod")
public JsonResult allInfos(HttpServletRequest request, HttpServletResponse response, @RequestParam Integer num) {
    log.info("param----->" + num);
    try {
        //Thread.sleep(num);
        if (num % 2 == 0) {
            log.info("num % 2 == 0");
            throw new BaseException("something bad with 2", 400);
        }
        if (num % 3 == 0) {
            log.info("num % 3 == 0");
            throw new BaseException("something bad with 3", 400);
        }
        if (num % 5 == 0) {
            log.info("num % 5 == 0");
            throw new ProgramException("something bad with 5", 400);
        }
        if (num % 7 == 0) {
            log.info("num % 7 == 0");
            int res = 1 / 0; // deliberately throws ArithmeticException
        }
        return JsonResult.ok();
    } catch (BufferUnderflowException e) {
        log.info("error");
        return JsonResult.error("error");
    }
}

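3.3.3 Configure the fallback method for the endpoint

/**
 * The parameter list must match the endpoint method, with the exception to handle
 * appended as the last parameter. This overload only handles BulkheadFullException;
 * other exception types need their own overload.
 * @param request
 * @param response
 * @param num
 * @param exception
 * @return
 */
public JsonResult fallbackMethod(HttpServletRequest request, HttpServletResponse response, @RequestParam Integer num, BulkheadFullException exception) {
    return JsonResult.error("error: bulkhead full " + num);
}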

3.3.4 Configure the rules

resilience4j.circuitbreaker:
  instances:
    backendA:
      registerHealthIndicator: true
      slidingWindowSize: 100
    backendB:
      registerHealthIndicator: true
      slidingWindowSize: 10
      permittedNumberOfCallsInHalfOpenState: 3
      slidingWindowType: TIME_BASED
      minimumNumberOfCalls: 20
      waitDurationInOpenState: 50s
      failureRateThreshold: 50
      eventConsumerBufferSize: 10
      recordFailurePredicate: io.github.robwin.exception.RecordFailurePredicate
resilience4j.retry:
  instances:
    backendA:
      maxRetryAttempts: 3
      waitDuration: 10s
      enableExponentialBackoff: true
      exponentialBackoffMultiplier: 2
      retryExceptions:
        - org.springframework.web.client.HttpServerErrorException
        - java.io.IOException
      ignoreExceptions:
        - io.github.robwin.exception.BusinessException
    backendB:
      maxRetryAttempts: 3
      waitDuration: 10s
      retryExceptions:
        - org.springframework.web.client.HttpServerErrorException
        - java.io.IOException
      ignoreExceptions:
        - io.github.robwin.exception.BusinessException
resilience4j.bulkhead:
  instances:
    backendA:
      maxConcurrentCalls: 10
    backendB:
      maxWaitDuration: 10ms
      maxConcurrentCalls: 20
resilience4j.thread-pool-bulkhead:
  instances:
    backendC:
      maxThreadPoolSize: 1
      coreThreadPoolSize: 1
      queueCapacity: 1
resilience4j.ratelimiter:
  instances:
    backendA:
      limitForPeriod: 10
      limitRefreshPeriod: 1s
      timeoutDuration: 0
      registerHealthIndicator: true
      eventConsumerBufferSize: 100
    backendB:
      limitForPeriod: 6
      limitRefreshPeriod: 500ms
      timeoutDuration: 3s
resilience4j.timelimiter:
  instances:
    backendA:
      timeoutDuration: 2s
      cancelRunningFuture: true
    backendB:
      timeoutDuration: 1s
      cancelRunningFuture: false

Rules set in configuration can be overridden in code.

3.3.5 Configure monitoring

For example, Grafana.
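To make the retry settings for backendA concrete (three attempts, a 10 s initial wait, exponential backoff with multiplier 2), the sketch below shows the wait schedule they imply. It is plain Java, not resilience4j's Retry implementation: the BackoffRetry class, its call method, and the injected sleeper are hypothetical, and it assumes the attempt limit counts the first call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;
import java.util.function.Supplier;

/** Illustrative retry with exponential backoff; not resilience4j's actual code. */
final class BackoffRetry {
    /**
     * Calls the action up to maxAttempts times. After each failed attempt except the
     * last, it waits, then multiplies the wait by the given factor. The sleeper is
     * injected so callers can pass Thread.sleep or a recorder for tests.
     */
    static <T> T call(Supplier<T> action, int maxAttempts, long initialWaitMillis,
                      double multiplier, LongConsumer sleeper) {
        long wait = initialWaitMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) {
                    throw e;                 // attempts exhausted: surface the last failure
                }
                sleeper.accept(wait);
                wait = Math.round(wait * multiplier);
            }
        }
    }

    public static void main(String[] args) {
        List<Long> waits = new ArrayList<>();
        int[] calls = {0};
        String result = call(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
            return "ok after " + calls[0] + " attempts";
        }, 3, 10_000, 2.0, ms -> waits.add(ms)); // records waits instead of sleeping
        System.out.println(result + ", waits=" + waits);
        // prints "ok after 3 attempts, waits=[10000, 20000]"
    }
}
```

Under these assumptions the third attempt starts 30 seconds after the first failure, which is worth weighing against the caller's own timeout.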

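4. Points to consider

Whether some exceptions need to be filtered out

Whether global default rules are needed

Additional middleware may need to be introduced

Traffic control in Kubernetes (k8s)

Rule storage and dynamic updates

Cost of integration and code changes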