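Spring Cloud Gateway tuning
Increasing the gateway's thread count gives a large boost in throughput.
The gateway is CPU-hungry, so a faster CPU helps, but weigh the overall performance of one high-spec machine against several low-spec machines.
The gateway places little demand on memory and disk.
Choose the machine configuration by trading off the throughput you want against the higher CPU load that comes with it.
The reactor.netty.ioWorkerCount parameter sets the number of Netty worker threads; it is defined in reactor.netty.ReactorNetty.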

How Spring Cloud Gateway works
Locate the source:
org.springframework.cloud.gateway.handler.RoutePredicateHandlerMapping
Then look at the implementation of RoutePredicateHandlerMapping#lookupRoute:
protected Mono<Route> lookupRoute(ServerWebExchange exchange) {
    return this.routeLocator
            .getRoutes()
            // individually filter routes so that filterWhen error delaying is not a problem
            .concatMap(route -> Mono
                    .just(route)
                    .filterWhen(r -> {
                        // add the current route we are testing
                        exchange.getAttributes().put(GATEWAY_PREDICATE_ROUTE_ATTR, r.getId());
                        return r.getPredicate().apply(exchange);
                    })
                    // instead of immediately stopping main flux due to error, log and swallow it
                    .doOnError(e -> logger.error("Error applying predicate for route: " + route.getId(), e))
                    .onErrorResume(e -> Mono.empty())
            )
            // .defaultIfEmpty() put a static Route not found
            // or .switchIfEmpty()
            // .switchIfEmpty(Mono.empty().log("noroute"))
            .next()
            // TODO: error handling
            .map(route -> {
                if (logger.isDebugEnabled()) {
                    logger.debug("Route matched: " + route.getId());
                }
                validateRoute(route, exchange);
                return route;
            });
    /* TODO: trace logging
        if (logger.isTraceEnabled()) {
            logger.trace("RouteDefinition did not match: " + routeDefinition.getId());
        }*/
}
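The lookup walks through every route rule in order until one matches, so with many routes, the later a route sits in the ordering, the slower it is to reach. Our regional projects number only 10, but we decided to try optimizing this anyway.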
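To make the cost of first-match lookup concrete, here is a minimal plain-Java sketch, not the gateway's code. The names SketchRoute and LinearRouteLookup and the path-prefix predicates are made up for illustration; the real gateway evaluates reactive predicates against the whole exchange.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical stand-in for a gateway route: an id plus a path predicate. */
final class SketchRoute {
    final String id;
    final Predicate<String> predicate;

    SketchRoute(String id, Predicate<String> predicate) {
        this.id = id;
        this.predicate = predicate;
    }
}

/** First-match lookup over an ordered route list, counting predicate evaluations. */
final class LinearRouteLookup {
    private final List<SketchRoute> routes = new ArrayList<>();
    private int evaluations; // total predicates evaluated so far

    void add(String id, String pathPrefix) {
        routes.add(new SketchRoute(id, path -> path.startsWith(pathPrefix)));
    }

    /** Returns the id of the first route, in declaration order, whose predicate accepts the path. */
    Optional<String> lookup(String path) {
        for (SketchRoute route : routes) {
            evaluations++;
            if (route.predicate.test(path)) {
                return Optional.of(route.id);
            }
        }
        return Optional.empty();
    }

    int evaluations() {
        return evaluations;
    }

    public static void main(String[] args) {
        LinearRouteLookup lookup = new LinearRouteLookup();
        for (int i = 0; i < 10; i++) {
            lookup.add("route-" + i, "/svc" + i + "/");
        }
        Optional<String> matched = lookup.lookup("/svc9/users");
        System.out.println(matched + " after " + lookup.evaluations() + " predicate checks");
    }
}
```

A request for the last of ten routes pays for ten predicate checks, and it pays that price again on every request, which is the cost the changes that follow try to avoid.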
We pulled this part of the source out and modified it ourselves, starting by hard-coding a single route:
protected Mono<Route> lookupRoute(ServerWebExchange exchange) {
    if (this.routeLocator instanceof CachingRouteLocator) {
        CachingRouteLocator cachingRouteLocator = (CachingRouteLocator) this.routeLocator;
        // getRouteMap() is a method we added ourselves; the stock CachingRouteLocator does not have it
        return cachingRouteLocator.getRouteMap().next()
                // hard-coded route id; justOrEmpty avoids a NullPointerException when the id is absent
                .flatMap(map -> Mono.justOrEmpty(map.get("api-user")))
                // matchRoute(exchange) holds the original lookup logic
                .switchIfEmpty(matchRoute(exchange));
    }
    return matchRoute(exchange);
}
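After re-running the load test, speed improved 10x and CPU usage was high only while requests were arriving, but some requests were still rejected and there were still stalls.
So, based on this result and on the routing rules we actually use, we hash the important parameters and the path when a request arrives and store the result, so that the next matching request skips the original predicate logic.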
protected Mono<Route> lookupRoute(ServerWebExchange exchange) {
    // String md5Key = getMd5Key(exchange);
    String appId = exchange.getRequest().getHeaders().getFirst("M-Sy-AppId");
    String serviceId = exchange.getRequest().getHeaders().getFirst("M-Sy-Service");
    String token = exchange.getRequest().getHeaders().getFirst("M-Sy-Token");
    String path = exchange.getRequest().getURI().getRawPath();
    StringBuilder value = new StringBuilder();
    String md5Key = "";
    if (StringUtils.isNotBlank(token)) {
        try {
            // when a token is present, appId and serviceId come from the token data stored in Redis
            Map<String, Object> params = (Map<String, Object>) redisTemplate.opsForValue().get("token:" + token);
            if (null != params && !params.isEmpty()) {
                JSONObject user = JSONObject.parseObject(params.get("user").toString());
                appId = user.getString("appId");
                serviceId = user.getString("serviceid");
            }
        } catch (Exception e) {
            logger.warn("Failed to read token data from Redis", e);
        }
    }
    // cache key: MD5 of the path alone, or of appId + serviceId + path when both are known
    if (StringUtils.isBlank(appId) || StringUtils.isBlank(serviceId)) {
        md5Key = DigestUtils.md5Hex(path);
    } else {
        value.append(appId);
        value.append(serviceId);
        value.append(path);
        md5Key = DigestUtils.md5Hex(value.toString());
    }
    if (logger.isDebugEnabled()) {
        logger.debug("Route matched before: " + routes.containsKey(md5Key));
    }
    if (routes.containsKey(md5Key)
            && this.routeLocator instanceof CachingRouteLocator) {
        final String key = md5Key;
        CachingRouteLocator cachingRouteLocator = (CachingRouteLocator) this.routeLocator;
        // note: getRouteMap() is again the method we added ourselves
        return cachingRouteLocator.getRouteMap().next()
                // justOrEmpty avoids a NullPointerException when the cached id is no longer in the route map
                .flatMap(map -> Mono.justOrEmpty(map.get(routes.get(key))))
                // if nothing is found, fall back to the stock matching logic
                .switchIfEmpty(matchRoute(exchange, md5Key));
    }
    return matchRoute(exchange, md5Key);
}
private Mono<Route> matchRoute(ServerWebExchange exchange, String md5Key) {
    // String md5Key = getMd5Key(exchange);
    return this.routeLocator
            .getRoutes()
            // individually filter routes so that filterWhen error delaying is not a problem
            .concatMap(route -> Mono
                    .just(route)
                    .filterWhen(r -> {
                        // add the current route we are testing
                        exchange.getAttributes().put(GATEWAY_PREDICATE_ROUTE_ATTR, r.getId());
                        return r.getPredicate().apply(exchange);
                    })
                    // instead of immediately stopping main flux due to error, log and swallow it
                    .doOnError(e -> logger.error("Error applying predicate for route: " + route.getId(), e))
                    .onErrorResume(e -> Mono.empty())
            )
            // .defaultIfEmpty() put a static Route not found
            // or .switchIfEmpty()
            // .switchIfEmpty(Mono.empty().log("noroute"))
            .next()
            // TODO: error handling
            .map(route -> {
                if (logger.isDebugEnabled()) {
                    logger.debug("Route matched: " + route.getId());
                    logger.debug("Cached: " + routes.get(md5Key));
                }
                // redisTemplate.opsForValue().set(ROUTE_KEY + md5Key, route.getId(), 5, TimeUnit.MINUTES);
                routes.put(md5Key, route.getId());
                validateRoute(route, exchange);
                return route;
            });
    /* TODO: trace logging
        if (logger.isTraceEnabled()) {
            logger.trace("RouteDefinition did not match: " + routeDefinition.getId());
        }*/
}
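The caching idea can be tried outside the gateway with a JDK-only sketch. Everything here is an assumption made for illustration: RouteIdCache, cacheKey, and resolve are invented names, MessageDigest stands in for DigestUtils.md5Hex, and the set of live route ids stands in for the map returned by our added getRouteMap().

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical cache from a request hash to the id of the route that matched it. */
final class RouteIdCache {
    // cache key (MD5 hex) -> matched route id
    private final Map<String, String> routes = new ConcurrentHashMap<>();

    /** Lower-case hex MD5 of the input, using only the JDK. */
    static String md5Hex(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    /** Same rule as in the article: hash the path alone unless both appId and serviceId are known. */
    static String cacheKey(String appId, String serviceId, String path) {
        if (isBlank(appId) || isBlank(serviceId)) {
            return md5Hex(path);
        }
        return md5Hex(appId + serviceId + path);
    }

    /**
     * Returns the cached route id if it is still a live route; otherwise runs the
     * full match (the slow path) and remembers its result for next time.
     */
    String resolve(String key, Set<String> liveRouteIds, Supplier<String> fullMatch) {
        String cached = routes.get(key);
        if (cached != null && liveRouteIds.contains(cached)) {
            return cached;
        }
        String matched = fullMatch.get();
        if (matched != null) {
            routes.put(key, matched);
        }
        return matched;
    }

    public static void main(String[] args) {
        RouteIdCache cache = new RouteIdCache();
        Set<String> live = Collections.singleton("api-user");
        String key = cacheKey("app1", "svc1", "/user/info");
        System.out.println(cache.resolve(key, live, () -> "api-user")); // slow path runs
        System.out.println(cache.resolve(key, live, () -> "api-user")); // served from the cache
    }
}
```

One thing the sketch makes visible: a cached id is only trusted while the route still exists, so a route refresh sends the request back through the full match instead of failing.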
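With this change, routing improved considerably, and we went on to analyze the rejected requests and the stalls.
We wondered whether Netty imposes a limit based on the machine's hardware: on my own laptop, connections topped out at around 200, and on the server at around 2000.
After going through a lot of material, we found that Netty exposes relatively few configuration options, unlike Tomcat, Undertow, and the like.
The SCG version we run is old and cannot swap Netty for Tomcat or Undertow, so I downloaded the latest SCG from the official site and switched the container to Tomcat and then to Undertow in turn. Neither showed the 200 limit.
Then I started digging into Netty material and found reactor.ipc.netty.workerCount.
DEFAULT_IO_WORKER_COUNT: if the system property reactor.ipc.netty.workerCount is set, that value is used; otherwise it falls back to Math.max(Runtime.getRuntime().availableProcessors(), 4).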
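The rule just described is small enough to restate as code. This is a sketch of the rule as the article states it, not a copy of the Reactor Netty source; WorkerCountDefaults and resolveWorkerCount are names invented here.

```java
/** Hypothetical helper that restates the default worker-count rule described above. */
final class WorkerCountDefaults {

    /** Explicit property value if present, otherwise max(available cores, 4). */
    static int resolveWorkerCount(String propertyValue, int availableProcessors) {
        if (propertyValue != null && !propertyValue.trim().isEmpty()) {
            return Integer.parseInt(propertyValue.trim());
        }
        return Math.max(availableProcessors, 4);
    }

    public static void main(String[] args) {
        String configured = System.getProperty("reactor.ipc.netty.workerCount");
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("worker threads: " + resolveWorkerCount(configured, cores));
    }
}
```

With nothing configured, a 2-core machine ends up with 4 worker threads and a 16-core machine with 16, so the default is tied to the hardware, which fits the suspicion that the limit depends on the machine.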
JSONObject message = new JSONObject();
try {
    // block the current thread for 30 seconds to simulate a stuck handler
    Thread.sleep(30000);
} catch (InterruptedException e) {
    Thread.currentThread().interrupt(); // restore the interrupt flag
    e.printStackTrace();
}
ServerHttpResponse response = exchange.getResponse();
message.put("code", 4199);
message.put("msg", "simulated blocking");
byte[] bits = message.toJSONString().getBytes(StandardCharsets.UTF_8);
DataBuffer buffer = response.bufferFactory().wrap(bits);
response.setStatusCode(HttpStatus.UNAUTHORIZED);
// set the charset explicitly; otherwise non-ASCII text is garbled in the browser
response.getHeaders().add("Content-Type", "application/json;charset=UTF-8");
return response.writeWith(Mono.just(buffer));
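The simulated-blocking test showed that this parameter limits how many requests can be returning responses at the same time, which is most likely why responses stalled during the load test. Load testing showed that the parameter already performs well at 3 times the core count of a 16-core CPU (48). At 4 times the core count (64), a single SCG node showed no stalls under load, although with the single node pushed to 15000, CPU usage was around 70-80%.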
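The effect can be reproduced without the gateway. In this rough sketch a fixed thread pool stands in for the Netty worker threads, which is a simplification: real event loops multiplex many connections per thread, but a handler that blocks still holds its thread for the whole duration. BlockedWorkerDemo and its method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo: blocking handlers on a small worker pool make later requests wait. */
final class BlockedWorkerDemo {

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
        }
    }

    /**
     * Submits `requests` handlers that each block for `blockMillis` to a pool of
     * `workers` threads and returns the elapsed wall-clock time in milliseconds.
     */
    static long run(int workers, int requests, long blockMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        long start = System.nanoTime();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(pool.submit(() -> sleepQuietly(blockMillis)));
            }
            for (Future<?> future : pending) {
                future.get(); // wait until every "request" has been answered
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // 16 blocking requests of 200 ms each, first on 4 workers, then on 16
        System.out.println("4 workers:  " + run(4, 16, 200) + " ms");
        System.out.println("16 workers: " + run(16, 16, 200) + " ms");
    }
}
```

With 4 workers the 16 requests are served in four waves, while 16 workers answer them in a single wave. Raising the worker count hides the stall; it does not remove the blocking call that causes it.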
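Having found this cause, and with my confidence restored after a spell of doubting my life choices, I searched Baidu for reactor.ipc.netty.workerCount and came across another parameter: reactor.ipc.netty.selectCount.
DEFAULT_IO_SELECT_COUNT: if the system property reactor.ipc.netty.selectCount is set, that value is used; otherwise it is -1, meaning there is no dedicated selector thread.
Find the source, reactor.ipc.netty.resources.DefaultLoopResources,
and look at this code: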
if (selectCount == -1) {
    this.selectCount = workerCount;
    this.serverSelectLoops = this.serverLoops;
    this.cacheNativeSelectLoops = this.cacheNativeServerLoops;
} else {
    this.selectCount = selectCount;
    this.serverSelectLoops =
            new NioEventLoopGroup(selectCount, threadFactory(this, "select-nio"));
    this.cacheNativeSelectLoops = new AtomicReference<>();
}
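After a long stretch of self-doubt and growing more determined with every setback (not really), I changed just 2 things in total and hit the modest goal of a 10x improvement.
Summary
Modify the stock route lookup logic.
Set the system property reactor.ipc.netty.workerCount to 3 or 4 times the number of CPU cores, and set reactor.ipc.netty.selectCount to 1 (any value other than -1 works).
Also, for the HttpClient configuration, see org.springframework.cloud.gateway.config.GatewayAutoConfiguration.NettyConfiguration.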
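As a sketch of the second item, the two properties can be set programmatically before the gateway starts. NettyTuning and apply are names invented here, and the example assumes the properties are read once when Reactor Netty initializes, so they have to be in place before the application context is created.

```java
/** Hypothetical startup helper that applies the two settings from the summary. */
final class NettyTuning {

    /** Sets workerCount to cores * multiplier and gives the selector its own thread. */
    static void apply(int multiplier) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.setProperty("reactor.ipc.netty.workerCount", String.valueOf(cores * multiplier));
        System.setProperty("reactor.ipc.netty.selectCount", "1");
    }

    public static void main(String[] args) {
        apply(3); // the article found 3x to 4x the core count to work well
        System.out.println("workerCount=" + System.getProperty("reactor.ipc.netty.workerCount"));
        System.out.println("selectCount=" + System.getProperty("reactor.ipc.netty.selectCount"));
        // the gateway's own startup call (for example SpringApplication.run) would follow here
    }
}
```

The same effect is available without code by passing JVM flags, for example -Dreactor.ipc.netty.workerCount=64 -Dreactor.ipc.netty.selectCount=1 on a 16-core machine. On newer versions the worker property is named reactor.netty.ioWorkerCount, as noted at the top of this post.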
source: www.icode9.com/content-4-1057716.html
