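Stop Logging Carelessly: This Is How to Log for Tracking Down Bugs!
Overview
In day-to-day work, programmers constantly deal with production incidents large and small. If the business code logs nothing, or logs poorly, locating the problem becomes much harder and fixing the bug takes longer. For high-impact bugs every second counts: finishing a few seconds later can mean a sharp drop in GMV.
One measure of a good programmer is how quickly and accurately they handle production problems, and logs are an excellent tool for locating problems fast.
Below I share some of the techniques and habits I use when logging in business systems; I hope they are helpful.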
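Use a Unified Log Format
A unified log format is best: it makes logs easier to read when locating problems and easier to collect and aggregate. I usually define a LogObject class that declares each log field. For example: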
import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.annotation.JsonInclude.Include;
import com.fasterxml.jackson.annotation.JsonProperty;

public class LogObject {
    @JsonProperty(index = 1)
    private String eventName;
    @JsonProperty(index = 2)
    private String traceId;
    @JsonProperty(index = 3)
    private String msg;
    @JsonProperty(index = 4)
    private long costTime;
    @JsonProperty(index = 6)
    private Integer userId;
    @JsonProperty(index = 7)
    private Object others;
    @JsonProperty(index = 8)
    private Object request;
    @JsonProperty(index = 9)
    private Object response;

    // Every setter returns this so that calls can be chained.
    public String getEventName() {
        return eventName;
    }
    public LogObject setEventName(String eventName) {
        this.eventName = eventName;
        return this;
    }
    public Object getRequest() {
        return request;
    }
    public LogObject setRequest(Object request) {
        this.request = request;
        return this;
    }
    public Object getResponse() {
        return response;
    }
    public LogObject setResponse(Object response) {
        this.response = response;
        return this;
    }
    public String getMsg() {
        return msg;
    }
    public LogObject setMsg(String msg) {
        this.msg = msg;
        return this;
    }
    public long getCostTime() {
        return costTime;
    }
    public LogObject setCostTime(long costTime) {
        this.costTime = costTime;
        return this;
    }
    public Integer getUserId() {
        return userId;
    }
    public LogObject setUserId(Integer userId) {
        this.userId = userId;
        return this;
    }
    public Object getOthers() {
        return others;
    }
    public LogObject setOthers(Object others) {
        this.others = others;
        return this;
    }
    public String getTraceId() {
        return traceId;
    }
    public LogObject setTraceId(String traceId) {
        this.traceId = traceId;
        return this;
    }
}
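traceId: call-chain (trace) ID
eventName: event name, usually the business method name
userId: ID of the consumer-side (C-end) user
msg: result message
costTime: interface response time
request: interface request parameters
response: interface return value
others: other business parameters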
A chained (fluent) style makes it convenient to set the field values:
long endTime = System.currentTimeMillis();
LogObject logObject = new LogObject();
logObject.setEventName(methodName)
         .setMsg(msg)
         .setTraceId(traceId)
         .setUserId(backendId)
         .setRequest(liveRoomPushOrderReqDto)
         .setResponse(response)
         .setCostTime((endTime - beginTime));
LOGGER.info(JSON.toJSONString(logObject));
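Of course, it is better still to wrap this in a utility class, for example one called LogTemplate, to serve as a single entry point. You can also use the JsonProperty annotation to specify the order of the fields, for example index = 1 to place eventName first.
    @JsonProperty(index = 1)
    private String eventName;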
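The single entry point described above could look roughly like the following. This is a minimal sketch rather than the author's actual utility: only the name LogTemplate comes from the text, while the format method, its parameters, and the hand-rolled JSON rendering are assumptions made to keep the example free of dependencies. A real project would more likely build a LogObject and serialize it with its JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical single entry point for structured log lines (illustrative only). */
final class LogTemplate {

    private LogTemplate() {
    }

    /** Builds one JSON log line with the fields in a fixed order. */
    static String format(String eventName, String traceId, String msg, long costTime) {
        // LinkedHashMap keeps insertion order, so eventName always comes first.
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("eventName", eventName);
        fields.put("traceId", traceId);
        fields.put("msg", msg);
        fields.put("costTime", costTime);

        StringBuilder json = new StringBuilder("{");
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            if (json.length() > 1) {
                json.append(',');
            }
            json.append('"').append(entry.getKey()).append("\":");
            Object value = entry.getValue();
            if (value == null) {
                json.append("null");
            } else if (value instanceof Number) {
                json.append(value);
            } else {
                json.append('"').append(escape(value.toString())).append('"');
            }
        }
        return json.append('}').toString();
    }

    /** Escapes backslashes and double quotes only, which is enough for this sketch. */
    private static String escape(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

Business code would then call something like LOGGER.info(LogTemplate.format(methodName, traceId, msg, costTime)), so that the field names and their order are decided in one place.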
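Put request and response Together
Putting the request and the return value in the same log entry has one clear advantage: the full context is visible at a glance. If they are printed as two entries, the return-value entry may be pushed far away by other log lines, and finding it takes another grep, which slows you down. A concrete log entry looks like this: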
{
   "eventName":"createOrder",
   "traceId":"createOrder_1574923602015",
   "msg":"success",
   "costTime":317,
   "request":{
      "uId":111111111,
      "skuList":[
         {
            "skuId":22222222,
            "buyNum":1,
            "buyPrice":8800
         }
      ]
   },
   "response":{
      "code":0,
      "message":"Operation succeeded",
      "data":{
         "bigOrderId":"BIG2019",
         "m2LOrderIds":{
            "MID2019":{
               "22222222":"LIT2019"
            }
         }
      }
   }
}
There are two ways to combine them into one entry. The cruder one is to use try/catch/finally directly in the code, for example:
    @PostMapping(value = "/createOrder")
    public JsonResult createOrder(@RequestBody Object request) throws Exception {
        String methodName = "/createOrder";
        Integer backendId = null;
        String msg = "success";
        long beginTime = System.currentTimeMillis();
        String traceId = "createOrder_" + beginTime;
        JsonResult response = null;
        try {
            OrderCreateRsp orderCreateRsp = orderOperateService.createOrder(request, traceId);
            response = JsonResult.success(orderCreateRsp);
        }
        catch (Exception e) {
            // Keep the failure reason so the finally block can log it.
            msg = e.getMessage();
            LOGGER.error(methodName + ",userId:" + backendId + ",request:" + JsonHelper.toJson(request), e);
            throw new BizException(0, "Failed to place order");
        }
        finally {
            // Runs on both success and failure, so request and response land in one entry.
            long endTime = System.currentTimeMillis();
            LogObject logObject = new LogObject();
            logObject.setEventName(methodName)
                     .setMsg(msg)
                     .setTraceId(traceId)
                     .setUserId(backendId)
                     .setRequest(request)
                     .setResponse(response)
                     .setCostTime((endTime - beginTime));
            LOGGER.info(JSON.toJSONString(logObject));
        }
        return response;
    }
The drawback of this approach is that every business method has to handle logging itself. A better approach is to use AOP together with a thread local, intercepting requests in one place and tying the return value to the request parameters. Many such solutions are available online, so I will not go into detail here.
For applications with demanding performance requirements, the first approach is actually the one I recommend, because AOP carries some performance overhead. The product aggregation service I previously worked on at Vipshop used the first approach, since it had to handle more than a million requests per second.
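To give a feel for the interception idea, here is a minimal sketch under stated assumptions. It uses a JDK dynamic proxy as a stand-in for Spring AOP so that it runs without any framework; RequestLogInterceptor, OrderApi, and the sink parameter are hypothetical names, and the log line is a plain map dump rather than JSON. The thread local holds the fields of the entry in progress, so business code can add values (such as userId) that end up in the same entry as the request and response.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical business interface, used only to demonstrate the proxy. */
interface OrderApi {
    String createOrder(String request);
}

/** Sketch of centralized request/response logging (JDK proxy instead of Spring AOP). */
final class RequestLogInterceptor {

    /** Fields of the log entry being built on the current thread. */
    private static final ThreadLocal<Map<String, Object>> CONTEXT =
            ThreadLocal.withInitial(LinkedHashMap::new);

    private RequestLogInterceptor() {
    }

    /** Lets business code add a field to the current thread's log entry. */
    static void put(String key, Object value) {
        CONTEXT.get().put(key, value);
    }

    /** Wraps target so each call through iface emits exactly one log line to sink. */
    static <T> T wrap(T target, Class<T> iface, Consumer<String> sink) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                (self, method, args) -> {
                    long beginTime = System.currentTimeMillis();
                    Map<String, Object> fields = CONTEXT.get();
                    fields.put("eventName", method.getName());
                    fields.put("request", Arrays.toString(args));
                    try {
                        Object response = method.invoke(target, args);
                        fields.put("msg", "success");
                        fields.put("response", response);
                        return response;
                    } catch (InvocationTargetException e) {
                        // Unwrap the reflection wrapper and rethrow the business exception.
                        fields.put("msg", String.valueOf(e.getCause().getMessage()));
                        throw e.getCause();
                    } finally {
                        fields.put("costTime", System.currentTimeMillis() - beginTime);
                        sink.accept(fields.toString());
                        CONTEXT.remove(); // do not leak state into pooled threads
                    }
                });
        return iface.cast(proxy);
    }
}
```

Note that this sketch keeps a single entry per thread, so nested intercepted calls on the same thread would share fields; a production version would need a stack or a per-call context.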
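Add a traceId to the Logs
If the application already uses a unified call-chain monitoring solution and can look up interface calls by call-chain ID, there is no need to add a traceId manually in code. If the application has not yet been connected to a call-chain system, adding a traceId is recommended, especially for aggregation services that call many mid-platform microservice interfaces. The aggregation-layer order placement flow, for example, calls the following microservices:
Marketing system
Order system
Payment system
If these calls are not tracked with a traceId, then when an order fails it is hard to tell which microservice interface failed. The example below shows a mini program client calling the aggregation-layer order interface: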
// Marketing system
{
   "eventName":"pms/getInfo",
   "traceId":"createOrder_1575270928956",
   "msg":"success",
   "costTime":2,
   "userId":1111111111,
   "request":{
      "userId":1111111111,
      "skuList":[
         {
            "skuId":2222,
            "skuPrice":65900,
            "buyNum":1,
            "activityType":0,
            "activityId":0
         }
      ]
   },
   "response":{
      "result":1,
      "msg":"success",
      "data":{
         "realPayFee":100
      }
   }
}
// Order system
{
   "eventName":"orderservice/createOrder",
   "traceId":"createOrder_1575270928956",
   "msg":"success",
   "costTime":29,
   "userId":null,
   "request":{
      "skuList":[
         {
            "skuId":2222,
            "buyNum":1,
            "buyPrice":65900
         }
      ]
   },
   "response":{
      "result":"200",
      "msg":"Call succeeded",
      "data":{
         "bigOrderId":"BIG2019",
         "m2LOrderIds":{
            "MID2019":{
               "88258135":"LIT2019"
            }
         }
      }
   }
}
// Payment system
{
   "eventName":"payservice/pay",
   "traceId":"createOrder_1575270928956",
   "msg":"success",
   "costTime":301,
   "request":{
      "orderId":"BIG2019",
      "paySubject":"test",
      "totalFee":65900
   },
   "response":{
      "requestId":"test",
      "code":0,
      "message":"Operation succeeded",
      "data":{
         "payId":123,
         "orderId":"BIG2019",
         "tradeType":"JSAPI",
         "perpayId":"test",
         "nonceStr":"test",
         "appId":"test",
         "signType":"MD5",
         "sign":"test",
         "timeStamp":"1575270929"
      }
   }
}
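As you can see, the aggregation layer calls the interfaces of three applications: marketing, order, and payment. The calls are linked together by the traceId createOrder_1575270928956, so grepping for this traceId is enough to find all the related calls and their context.
How should the traceId be generated? A simple approach is to combine System.currentTimeMillis() with the business interface name, for example:
long beginTime = System.currentTimeMillis();
String traceId = "createOrder_" + beginTime;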
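One caveat with the timestamp-only form is that two requests arriving in the same millisecond would get the same traceId. The variation below is not from the original article; it is a small sketch that appends a per-process counter to keep IDs distinct within one JVM. The class name TraceIds and the method next are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: interface name + timestamp + per-process sequence number. */
final class TraceIds {

    /** Monotonic counter shared by all threads in this JVM. */
    private static final AtomicLong SEQUENCE = new AtomicLong();

    private TraceIds() {
    }

    /** Returns eventName_millis_sequence; the sequence keeps same-millisecond IDs apart. */
    static String next(String eventName, long nowMillis) {
        return eventName + "_" + nowMillis + "_" + SEQUENCE.incrementAndGet();
    }
}
```

A caller would use TraceIds.next("createOrder", System.currentTimeMillis()). The counter only guarantees uniqueness inside one process, so multi-instance deployments would still need something extra, such as an instance identifier.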
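Adding a traceId does intrude on the business methods, for example:
public void createOrder(Object obj) {
    long beginTime = System.currentTimeMillis();
    String traceId = "createOrder_" + beginTime;
    // The same traceId is passed to every downstream service call.
    pmsService.getInfo(obj, traceId);
    orderService.createOrder(obj, traceId);
    payService.getPrepayId(obj, traceId);
}
Internal service methods such as those on pmsService all need an extra traceId parameter. So far I find this acceptable. If it feels too intrusive, consider a thread local instead: store the traceId for the current thread when the request starts, then read it back from the current thread inside the business methods, so that traceId parameters are not scattered across every interface method.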
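The thread local alternative can be sketched as follows. This is an illustration under assumptions, not code from the article: TraceContext and its methods start, traceId, and clear are hypothetical names, and the ID format simply reuses the interface name plus timestamp scheme shown earlier.

```java
/** Hypothetical holder that keeps the traceId of the request handled by the current thread. */
final class TraceContext {

    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    private TraceContext() {
    }

    /** Called once when request handling starts; returns the new traceId. */
    static String start(String eventName) {
        String traceId = eventName + "_" + System.currentTimeMillis();
        TRACE_ID.set(traceId);
        return traceId;
    }

    /** Read by service methods instead of taking a traceId parameter; null if not started. */
    static String traceId() {
        return TRACE_ID.get();
    }

    /** Must run when the request ends, otherwise pooled threads keep a stale ID. */
    static void clear() {
        TRACE_ID.remove();
    }
}
```

The entry method would call TraceContext.start("createOrder") at the top and TraceContext.clear() in a finally block, while methods such as pmsService.getInfo read TraceContext.traceId() when they log. One limitation to keep in mind: a plain ThreadLocal value does not follow the work onto other threads, so calls handed off to a thread pool would need the ID passed along explicitly.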


