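Since Our Microservices Project Adopted the SkyWalking Monitoring System, I Even Sleep Better~
Original link: https://cnblogs.com/cjsblog/p/14075486.html
SkyWalking is an application performance monitoring (APM) system designed especially for microservices, cloud-native, and container-based (Docker, Kubernetes, Mesos) architectures. Besides monitoring application metrics, it can also trace distributed call chains. Components with similar functionality include Zipkin, Pinpoint, and CAT.
Here are a few screenshots to show the result; after that we will set it up and use it step by step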




1. Concepts and Architecture
SkyWalking is an open-source monitoring platform that collects, analyzes, aggregates, and visualizes data from services and cloud-native infrastructure. It provides a simple way to keep a clear view of a distributed system, even across clouds. It is a modern APM designed specifically for cloud-native, container-based distributed systems.
SkyWalking monitors applications along three dimensions: service, service instance, and endpoint
Services and instances need little explanation; an endpoint is a path, or URI, within a service
SkyWalking allows users to understand the topology relationship between Services and Endpoints, to view the metrics of every Service/Service Instance/Endpoint and to set alarm rules.
1.1. Architecture
SkyWalking is logically divided into four parts: Probes, Platform backend, Storage, and UI
The structure is straightforward: the probe is the agent, which collects data and reports it to the server; the server processes and stores the data; the UI displays it
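The division of labor between the four parts can be pictured with a toy model. This is only a sketch to make the data flow concrete: every class and method name below is hypothetical, and it does not reflect how SkyWalking is implemented internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of probe -> backend -> storage -> UI. All names are hypothetical.
public class PipelineSketch {

    /** Storage: aggregated metrics keyed by "service:endpoint" -> {latency sum, call count}. */
    static final class Storage {
        final Map<String, long[]> sumAndCount = new LinkedHashMap<>();
    }

    /** Platform backend: aggregates what probes report and writes it to storage. */
    static final class Backend {
        private final Storage storage;

        Backend(Storage storage) {
            this.storage = storage;
        }

        void receive(String service, String endpoint, long latencyMs) {
            long[] agg = storage.sumAndCount.computeIfAbsent(service + ":" + endpoint, k -> new long[2]);
            agg[0] += latencyMs; // running sum of latencies
            agg[1]++;            // number of calls seen
        }
    }

    /** Probe: lives next to one service and only reports, it does no analysis itself. */
    static final class Probe {
        private final String service;
        private final Backend backend;

        Probe(String service, Backend backend) {
            this.service = service;
            this.backend = backend;
        }

        void report(String endpoint, long latencyMs) {
            backend.receive(service, endpoint, latencyMs);
        }
    }

    /** UI: reads from storage only and renders one line per endpoint. */
    static List<String> render(Storage storage) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, long[]> e : storage.sumAndCount.entrySet()) {
            long[] agg = e.getValue();
            lines.add(e.getKey() + " avg=" + (agg[0] / agg[1]) + "ms calls=" + agg[1]);
        }
        return lines;
    }

    public static void main(String[] args) {
        Storage storage = new Storage();
        Probe probe = new Probe("user-center", new Backend(storage));
        probe.report("/users", 120);
        probe.report("/users", 80);
        render(storage).forEach(System.out::println);
    }
}
```

The point of the split is that the probe stays cheap (it only reports), while aggregation and storage can be scaled separately on the server side.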

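2. Download and Installation
SkyWalking comes in two distributions: an ES version and a non-ES version. If you decide to use ElasticSearch as the storage, download the ES version.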
https://skywalking.apache.org/downloads/
https://archive.apache.org/dist/skywalking/


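The agent directory will later be copied to each machine that hosts a service, where it serves as the probe
The bin directory contains the service startup scripts
The config directory contains the configuration files
The oap-libs directory contains the jar packages needed to run the OAP service
The webapp directory contains the jar packages needed to run the web service
Next, choose a storage. The supported storages are: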
H2
ElasticSearch 6, 7
MySQL
TiDB
InfluxDB
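For a monitoring system, rule out H2 and MySQL first. InfluxDB is the recommendation here: it is a time-series database, which fits this scenario well
However, I am not very familiar with InfluxDB, so for now I use ElasticSearch 7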
https://github.com/apache/skywalking/blob/master/docs/en/setup/backend/backend-storage.md
2.1. Installing ElasticSearch
https://www.elastic.co/guide/en/elasticsearch/reference/7.10/targz.html
# Start
./bin/elasticsearch -d -p pid
# Stop
pkill -F pid

ElasticSearch 7.x requires Java 11 or later, but if the JAVA_HOME environment variable is set, it uses your own Java version
The startup usually reports the following three errors:
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]
[2]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[3]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
Solution:
Append the following to /etc/security/limits.conf:
* soft nofile 65536
* hard nofile 65536
* soft nproc 4096
* hard nproc 4096
The result can be checked with the following four commands:
ulimit -Hn
ulimit -Sn
ulimit -Hu
ulimit -Su
Edit /etc/sysctl.conf and append the following:
vm.max_map_count=262144
In the ES configuration file elasticsearch.yml, uncomment the following line and keep a single node
cluster.initial_master_nodes: ["node-1"]
To make ES reachable via ip:port, also change the network setting
network.host: 0.0.0.0
After the changes it looks like this:


At this point, ElasticSearch has started successfully
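The three startup errors above are threshold and configuration checks. The sketch below mirrors only the comparisons, using the limits quoted in the error messages; the class and method names are hypothetical, and this is not how ElasticSearch implements its checks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-flight check that mirrors the three errors quoted above.
public class BootstrapCheckSketch {

    // Minimums copied from the error messages in this article.
    static final long MIN_FILE_DESCRIPTORS = 65535;
    static final long MIN_MAX_MAP_COUNT = 262144;

    /**
     * @param maxFileDescriptors  value reported by "ulimit -Hn" for the ES user
     * @param maxMapCount         value of vm.max_map_count
     * @param discoveryConfigured true if discovery.seed_hosts, discovery.seed_providers
     *                            or cluster.initial_master_nodes is set in elasticsearch.yml
     * @return one message per failed check; empty when everything passes
     */
    static List<String> check(long maxFileDescriptors, long maxMapCount, boolean discoveryConfigured) {
        List<String> problems = new ArrayList<>();
        if (maxFileDescriptors < MIN_FILE_DESCRIPTORS) {
            problems.add("max file descriptors [" + maxFileDescriptors + "] is too low, need at least ["
                    + MIN_FILE_DESCRIPTORS + "]");
        }
        if (maxMapCount < MIN_MAX_MAP_COUNT) {
            problems.add("vm.max_map_count [" + maxMapCount + "] is too low, need at least ["
                    + MIN_MAX_MAP_COUNT + "]");
        }
        if (!discoveryConfigured) {
            problems.add("no discovery setting is configured");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Values from the failing host described in this article: all three checks fail.
        check(4096, 65530, false).forEach(System.out::println);
        // Values after the fixes: nofile 65536, vm.max_map_count 262144, initial master nodes set.
        System.out.println("problems after fix: " + check(65536, 262144, true).size());
    }
}
```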
One node is not enough, so here we build a cluster of three nodes
192.168.100.14 config/elasticsearch.yml
cluster.name: my-monitor
node.name: node-1
network.host: 192.168.100.14
http.port: 9200
discovery.seed_hosts: ["192.168.100.14:9300", "192.168.100.15:9300", "192.168.100.19:9300"]
cluster.initial_master_nodes: ["node-1"]
192.168.100.15 config/elasticsearch.yml
cluster.name: my-monitor
node.name: node-2
network.host: 192.168.100.15
http.port: 9200
discovery.seed_hosts: ["192.168.100.14:9300", "192.168.100.15:9300", "192.168.100.19:9300"]
cluster.initial_master_nodes: ["node-1"]
192.168.100.19 config/elasticsearch.yml
cluster.name: my-monitor
node.name: node-3
network.host: 192.168.100.19
http.port: 9200
discovery.seed_hosts: ["192.168.100.14:9300", "192.168.100.15:9300", "192.168.100.19:9300"]
cluster.initial_master_nodes: ["node-1"]
It is also recommended to edit config/jvm.options on all three nodes
-Xms2g
-Xmx2g
Start the three nodes one after another
pkill -F pid
./bin/elasticsearch -d -p pid
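The three node configurations differ only in node.name and network.host; everything else is shared. A small sketch that renders them from one host list makes that explicit. The class and method names are hypothetical, and the node-N naming and port numbers simply follow the files shown above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper that renders the per-node elasticsearch.yml shown above.
public class EsNodeConfigSketch {

    /** Renders the config for hosts.get(index); nodes are named node-1, node-2, ... */
    static String render(String clusterName, List<String> hosts, int index) {
        // Every node lists all transport addresses (port 9300) as seed hosts.
        StringJoiner seeds = new StringJoiner(", ", "[", "]");
        for (String host : hosts) {
            seeds.add("\"" + host + ":9300\"");
        }
        return "cluster.name: " + clusterName + "\n"
                + "node.name: node-" + (index + 1) + "\n"
                + "network.host: " + hosts.get(index) + "\n"
                + "http.port: 9200\n"
                + "discovery.seed_hosts: " + seeds + "\n"
                + "cluster.initial_master_nodes: [\"node-1\"]\n";
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("192.168.100.14", "192.168.100.15", "192.168.100.19");
        for (int i = 0; i < hosts.size(); i++) {
            System.out.println("# " + hosts.get(i) + " config/elasticsearch.yml");
            System.out.println(render("my-monitor", hosts, i));
        }
    }
}
```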



Next, set the ES addresses in config/application.yml under the skywalking directory
storage:
  selector: ${SW_STORAGE:elasticsearch7}
  elasticsearch7:
    nameSpace: ${SW_NAMESPACE:""}
    clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:192.168.100.14:9200,192.168.100.15:9200,192.168.100.19:9200}
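The ${NAME:default} values in this file read as "use the environment variable NAME if it is set, otherwise the default after the first colon". The sketch below illustrates that rule only; it is a hypothetical re-implementation for explanation, not SkyWalking's own parsing code, and it keeps the default text literally (so a default of "" stays as two quote characters).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical resolver for ${NAME:default} placeholders.
public class PlaceholderSketch {

    // NAME runs up to the first ':'; the default runs up to the closing brace,
    // so defaults such as "192.168.100.14:9200" may contain colons themselves.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^:}]+):([^}]*)\\}");

    static String resolve(String value, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = env.getOrDefault(m.group(1), m.group(2));
            // quoteReplacement keeps '$' and '\' in the value from being treated as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        // Nothing set: the default after the colon wins.
        System.out.println(resolve("${SW_STORAGE:elasticsearch7}", env));
        // Variable set: it overrides the default.
        env.put("SW_STORAGE_ES_CLUSTER_NODES", "10.0.0.1:9200");
        System.out.println(resolve("${SW_STORAGE_ES_CLUSTER_NODES:192.168.100.14:9200}", env));
    }
}
```

With this reading, the same application.yml can be shipped to several environments and adjusted per machine by exporting variables, without editing the file.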
2.2. Installing the Agent
https://github.com/apache/skywalking/blob/v8.2.0/docs/en/setup/service-agent/java-agent/README.md
Copy the agent directory to each machine that hosts a service
scp -r ./agent chengjs@192.168.100.12:~/
Here, I copied it into each service's directory

The plugins directory holds the plugins used by the probe. SkyWalking plugins are plug-and-play, so you can move plugins from optional-plugins into plugins
Edit the agent/config/agent.config configuration file; the settings can also be given as command-line arguments
The main settings are the service name and the backend service address
agent.service_name=${SW_AGENT_NAME:user-center}
collector.backend_service=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:192.168.100.17:11800}
Of course, they can also be set through environment variables or system properties, for example:
export SW_AGENT_COLLECTOR_BACKEND_SERVICES=127.0.0.1:11800
Finally, specify the probe with the -javaagent command-line argument when starting the service
java -javaagent:/path/to/skywalking-agent/skywalking-agent.jar -jar yourApp.jar
For example:
java -javaagent:./agent/skywalking-agent.jar -Dspring.profiles.active=dev -Xms512m -Xmx1024m -jar demo-0.0.1-SNAPSHOT.jar
3. Starting the Services
Edit webapp/webapp.yml to change the port number and the backend service address
server:
  port: 9000
collector:
  path: /graphql
  ribbon:
    ReadTimeout: 10000
    # Point to all backend's restHost:restPort, split by ,
    listOfServers: 127.0.0.1:12800
Start the services
bin/startup.sh
Or start them separately, one after the other
bin/oapService.sh
bin/webappService.sh
Check the log files in the logs directory to see whether startup succeeded
Open http://127.0.0.1:9000 in a browser
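Besides reading the logs, a quick way to confirm that the services came up is to check whether their ports accept connections. The sketch below assumes everything runs on 127.0.0.1 with the ports used in this article: 11800 (the address agents report to), 12800 (the REST address webapp.yml points at) and 9000 (the UI). The class name is hypothetical, and an open port only shows that something is listening, not that the service is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
public class PortCheckSketch {

    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing reachable on that port.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = "127.0.0.1"; // assumption: OAP and UI run on the local machine
        int[] ports = {11800, 12800, 9000};
        for (int port : ports) {
            System.out.println(host + ":" + port + " listening=" + isListening(host, port, 1000));
        }
    }
}
```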
4. Alarms
Edit alarm-settings.yml to configure alarm rules and notifications
https://github.com/apache/skywalking/blob/v8.2.0/docs/en/setup/backend/backend-alarm.md
The focus here is on alarm notifications

To send notifications through a DingTalk robot, create a new project
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.4.0</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.wt.monitor</groupId>
    <artifactId>skywalking-alarm</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <name>skywalking-alarm</name>
    <properties>
        <java.version>1.8</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>com.aliyun</groupId>
            <artifactId>alibaba-dingtalk-service-sdk</artifactId>
            <version>1.0.1</version>
        </dependency>
        <dependency>
            <groupId>commons-codec</groupId>
            <artifactId>commons-codec</artifactId>
            <version>1.15</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.75</version>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
Optional dependency (not recommended)
<dependency>
    <groupId>org.apache.skywalking</groupId>
    <artifactId>server-core</artifactId>
    <version>8.2.0</version>
</dependency>
Define the alarm message entity class
package com.wt.monitor.skywalking.alarm.domain;

import lombok.Data;

import java.io.Serializable;

/**
 * @author ChengJianSheng
 * @date 2020/12/1
 */
@Data
public class AlarmMessageDTO implements Serializable {

    private int scopeId;

    private String scope;

    /**
     * Target scope entity name
     */
    private String name;

    private String id0;

    private String id1;

    private String ruleName;

    /**
     * Alarm text message
     */
    private String alarmMessage;

    /**
     * Alarm time measured in milliseconds
     */
    private long startTime;
}
Send the DingTalk robot message
package com.wt.monitor.skywalking.alarm.service;

import com.dingtalk.api.DefaultDingTalkClient;
import com.dingtalk.api.DingTalkClient;
import com.dingtalk.api.request.OapiRobotSendRequest;
import com.taobao.api.ApiException;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.codec.binary.Base64;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;

/**
 * https://ding-doc.dingtalk.com/doc#/serverapi2/qf2nxq
 * @author ChengJianSheng
 * @date 2020/12/1
 */
@Slf4j
@Service
public class DingTalkAlarmService {

    @Value("${dingtalk.webhook}")
    private String webhook;

    @Value("${dingtalk.secret}")
    private String secret;

    public void sendMessage(String content) {
        try {
            Long timestamp = System.currentTimeMillis();

            // Sign "timestamp\nsecret" with HmacSHA256, using the secret as the key
            String stringToSign = timestamp + "\n" + secret;
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes("UTF-8"), "HmacSHA256"));
            byte[] signData = mac.doFinal(stringToSign.getBytes("UTF-8"));
            // Base64-encode the signature, then URL-encode it for use as a query parameter
            String sign = URLEncoder.encode(new String(Base64.encodeBase64(signData)), "UTF-8");

            String serverUrl = webhook + "&timestamp=" + timestamp + "&sign=" + sign;
            DingTalkClient client = new DefaultDingTalkClient(serverUrl);
            OapiRobotSendRequest request = new OapiRobotSendRequest();
            request.setMsgtype("text");
            OapiRobotSendRequest.Text text = new OapiRobotSendRequest.Text();
            text.setContent(content);
            request.setText(text);
            client.execute(request);
        } catch (ApiException | NoSuchAlgorithmException | UnsupportedEncodingException | InvalidKeyException e) {
            // log.error already records the stack trace
            log.error(e.getMessage(), e);
        }
    }
}
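The signing step in sendMessage can be tried in isolation, without Spring or the DingTalk SDK. The sketch below repeats the same computation using only JDK classes (java.util.Base64 instead of commons-codec). The class name, the example webhook token and the example secret are hypothetical placeholders.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// JDK-only sketch of the signature computed in sendMessage above.
public class DingTalkSignSketch {

    /** HmacSHA256 over "timestamp\nsecret" keyed with the secret, then Base64, then URL-encoded. */
    static String sign(long timestamp, String secret)
            throws GeneralSecurityException, UnsupportedEncodingException {
        String stringToSign = timestamp + "\n" + secret;
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
        byte[] digest = mac.doFinal(stringToSign.getBytes(StandardCharsets.UTF_8));
        // Base64 may contain '+', '/' and '=', so it must be URL-encoded before going into the query string.
        return URLEncoder.encode(Base64.getEncoder().encodeToString(digest), "UTF-8");
    }

    /** Appends timestamp and sign; assumes the webhook already ends with "?access_token=...". */
    static String signedUrl(String webhook, long timestamp, String secret)
            throws GeneralSecurityException, UnsupportedEncodingException {
        return webhook + "&timestamp=" + timestamp + "&sign=" + sign(timestamp, secret);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder values, not real credentials.
        String webhook = "https://oapi.dingtalk.com/robot/send?access_token=YOUR_TOKEN";
        String secret = "SEC_YOUR_SECRET";
        System.out.println(signedUrl(webhook, System.currentTimeMillis(), secret));
    }
}
```

Because the timestamp is part of the signed string, the URL has to be rebuilt for every message rather than cached.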
AlarmController.java
package com.wt.monitor.skywalking.alarm.controller;

import com.alibaba.fastjson.JSON;
import com.wt.monitor.skywalking.alarm.domain.AlarmMessageDTO;
import com.wt.monitor.skywalking.alarm.service.DingTalkAlarmService;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.text.MessageFormat;
import java.util.List;

/**
 * @author ChengJianSheng
 * @date 2020/12/1
 */
// NOTE: the mapping paths are missing from this listing; "/skywalking" and "/alarm" are
// assumed values and must match the webhook URL configured in alarm-settings.yml.
@Slf4j
@RestController
@RequestMapping("/skywalking")
public class AlarmController {

    @Autowired
    private DingTalkAlarmService dingTalkAlarmService;

    @PostMapping("/alarm")
    public void alarm(@RequestBody List<AlarmMessageDTO> alarmMessageDTOList) {
        log.info("Received alarm messages: {}", JSON.toJSONString(alarmMessageDTOList));
        if (null != alarmMessageDTOList) {
            // Forward each alarm to DingTalk as a text message
            alarmMessageDTOList.forEach(e -> dingTalkAlarmService.sendMessage(
                    MessageFormat.format("-----Alarm from SkyWalking-----\n[Name]: {0}\n[Message]: {1}\n",
                            e.getName(), e.getAlarmMessage())));
        }
    }
}
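To try the receiver without waiting for a real alarm, you can post a hand-made payload to it. The sketch below builds a one-element JSON array whose field names follow AlarmMessageDTO; all values are invented, and the default URL (port 8080, path /skywalking/alarm) is an assumption, so pass the address of your own controller as the first argument.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical smoke test: posts one fake alarm to the alarm receiver.
public class AlarmWebhookSmokeTest {

    /** One alarm wrapped in a JSON array; field names follow AlarmMessageDTO, values are invented. */
    static String sampleAlarmJson(long startTimeMillis) {
        return "[{"
                + "\"scopeId\":1,"
                + "\"scope\":\"SERVICE\","
                + "\"name\":\"user-center\","
                + "\"id0\":\"1\","
                + "\"id1\":\"\","
                + "\"ruleName\":\"service_resp_time_rule\","
                + "\"alarmMessage\":\"Response time of service user-center is too high\","
                + "\"startTime\":" + startTimeMillis
                + "}]";
    }

    /** Sends the JSON with an HTTP POST and returns the response status code. */
    static int post(String url, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json; charset=UTF-8");
            conn.setDoOutput(true);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(json.getBytes(StandardCharsets.UTF_8));
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default; replace with the URL your controller is actually mapped to.
        String url = args.length > 0 ? args[0] : "http://127.0.0.1:8080/skywalking/alarm";
        System.out.println("HTTP " + post(url, sampleAlarmJson(System.currentTimeMillis())));
    }
}
```

If the request succeeds, the controller should log the received list and the DingTalk group should show one message for the fake alarm.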



