Stress Testing Prometheus-Compatible Time Series Databases
To compare different VictoriaMetrics versions with each other, or to compare VictoriaMetrics with other solutions that support the Prometheus remote_write protocol, VictoriaMetrics provides a dedicated project: prometheus-benchmark.
How it works
The implementation of the project is actually quite simple:
- node_exporter is used as the source of production-like metrics.
- An nginx instance sits in front of node_exporter as a caching proxy, which reduces the load on node_exporter when it is scraped by a large number of targets.
- vmagent scrapes the node_exporter metrics and forwards them via the Prometheus remote_write protocol to the configured targets. If multiple targets are configured, a separate vmagent instance independently pushes the scraped data to each of them.

Note that the benchmark does not collect metrics from the configured remote_write targets; it only collects metrics from its internal components, vmagent and vmalert. It assumes that the Prometheus storage system under test is monitored separately. For example, below we use a single-node VictoriaMetrics as the remote_write target, so we can monitor it ourselves.
The core of the project simply keeps updating a scrape configuration file based on the given parameters; vmagent then fetches this configuration from the HTTP endpoint exposed by the project (via the -promscrape.config flag) and scrapes the targets accordingly. The core code looks like this:
package main

import (
	"flag"
	"fmt"
	"log"
	"math/rand"
	"net/http"
	"sync"
	"time"

	"gopkg.in/yaml.v2"
)

var (
	listenAddr                 = flag.String("httpListenAddr", ":8436", "TCP address for incoming HTTP requests")
	targetsCount               = flag.Int("targetsCount", 100, "The number of scrape targets to return from -httpListenAddr. Each target has the same address defined by -targetAddr")
	targetAddr                 = flag.String("targetAddr", "demo.robustperception.io:9090", "Address with port to use as target address the scrape config returned from -httpListenAddr")
	scrapeInterval             = flag.Duration("scrapeInterval", time.Second*5, "The scrape_interval to set at the scrape config returned from -httpListenAddr")
	scrapeConfigUpdateInterval = flag.Duration("scrapeConfigUpdateInterval", time.Minute*10, "The -scrapeConfigUpdatePercent scrape targets are updated in the scrape config returned from -httpListenAddr every -scrapeConfigUpdateInterval")
	scrapeConfigUpdatePercent  = flag.Float64("scrapeConfigUpdatePercent", 1, "The -scrapeConfigUpdatePercent scrape targets are updated in the scrape config returned from -httpListenAddr every -scrapeConfigUpdateInterval")
)

func main() {
	flag.Parse()
	flag.VisitAll(func(f *flag.Flag) {
		log.Printf("-%s=%s", f.Name, f.Value)
	})
	c := newConfig(*targetsCount, *scrapeInterval, *targetAddr)
	var cLock sync.Mutex
	p := *scrapeConfigUpdatePercent / 100
	r := rand.New(rand.NewSource(time.Now().UnixNano()))
	// Periodically relabel a percentage of targets to simulate time series churn.
	go func() {
		rev := 0
		for range time.Tick(*scrapeConfigUpdateInterval) {
			rev++
			revStr := fmt.Sprintf("r%d", rev)
			cLock.Lock()
			for _, sc := range c.ScrapeConfigs {
				for _, stc := range sc.StaticConfigs {
					if r.Float64() >= p {
						continue
					}
					stc.Labels["revision"] = revStr
				}
			}
			cLock.Unlock()
		}
	}()
	// Serve the current scrape config as YAML on every HTTP request.
	rh := func(w http.ResponseWriter, r *http.Request) {
		cLock.Lock()
		data := c.marshalYAML()
		cLock.Unlock()
		w.Header().Set("Content-Type", "text/yaml")
		w.Write(data)
	}
	hf := http.HandlerFunc(rh)
	log.Printf("starting scrape config updater at http://%s/", *listenAddr)
	if err := http.ListenAndServe(*listenAddr, hf); err != nil {
		log.Fatalf("unexpected error when running the http server: %s", err)
	}
}

func (c *config) marshalYAML() []byte {
	data, err := yaml.Marshal(c)
	if err != nil {
		log.Fatalf("BUG: unexpected error when marshaling config: %s", err)
	}
	return data
}

func newConfig(targetsCount int, scrapeInterval time.Duration, targetAddr string) *config {
	scs := make([]*staticConfig, 0, targetsCount)
	for i := 0; i < targetsCount; i++ {
		scs = append(scs, &staticConfig{
			Targets: []string{targetAddr},
			Labels: map[string]string{
				"instance": fmt.Sprintf("host-%d", i),
				"revision": "r0",
			},
		})
	}
	return &config{
		Global: globalConfig{
			ScrapeInterval: scrapeInterval,
		},
		ScrapeConfigs: []*scrapeConfig{
			{
				JobName:       "node_exporter",
				StaticConfigs: scs,
			},
		},
	}
}

// config represents essential parts from Prometheus config defined at https://prometheus.io/docs/prometheus/latest/configuration/configuration/
type config struct {
	Global        globalConfig    `yaml:"global"`
	ScrapeConfigs []*scrapeConfig `yaml:"scrape_configs"`
}

// globalConfig represents essential parts for `global` section of Prometheus config.
//
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/
type globalConfig struct {
	ScrapeInterval time.Duration `yaml:"scrape_interval"`
}

// scrapeConfig represents essential parts for `scrape_config` section of Prometheus config.
//
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
type scrapeConfig struct {
	JobName       string          `yaml:"job_name"`
	StaticConfigs []*staticConfig `yaml:"static_configs"`
}

// staticConfig represents essential parts for `static_config` section of Prometheus config.
//
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#static_config
type staticConfig struct {
	Targets []string          `yaml:"targets"`
	Labels  map[string]string `yaml:"labels"`
}
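To see what vmagent actually receives, we can fetch the generated config ourselves. The sketch below is not part of the project; it assumes the default -httpListenAddr of :8436 and mirrors only the fields needed to count the generated targets:

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"

	"gopkg.in/yaml.v2"
)

// scrapeTargets mirrors just enough of the served scrape config to count targets.
type scrapeTargets struct {
	ScrapeConfigs []struct {
		JobName       string `yaml:"job_name"`
		StaticConfigs []struct {
			Targets []string          `yaml:"targets"`
			Labels  map[string]string `yaml:"labels"`
		} `yaml:"static_configs"`
	} `yaml:"scrape_configs"`
}

func main() {
	// Fetch the YAML served by the config updater (default -httpListenAddr is :8436).
	resp, err := http.Get("http://localhost:8436/")
	if err != nil {
		log.Fatalf("cannot fetch scrape config: %s", err)
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatalf("cannot read response: %s", err)
	}
	var st scrapeTargets
	if err := yaml.Unmarshal(data, &st); err != nil {
		log.Fatalf("cannot parse scrape config: %s", err)
	}
	for _, sc := range st.ScrapeConfigs {
		fmt.Printf("job %q: %d static targets\n", sc.JobName, len(sc.StaticConfigs))
	}
}

With the default flags this should report 100 static targets for the node_exporter job, matching -targetsCount.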
We can control the benchmark with the following parameters:
targetsCount
targetsCount defines how many node_exporter scrape targets are added to the vmagent scrape config; each of them carries a unique instance label. This parameter affects the number and the cardinality of the scraped metrics. A single node_exporter typically exposes around 815 unique metrics, so with targetsCount set to 1000 the benchmark generates roughly 815 * 1000 = 815K active time series.
scrapeInterval
scrapeInterval defines how frequently each target is scraped. This parameter affects the data ingestion rate: the smaller the interval, the higher the ingestion rate.
remoteStorages
remoteStorages contains the list of systems under test to which the scraped metrics are pushed. If multiple targets are configured, a separate vmagent instance pushes the same data to each of them.
Churn rate
scrapeConfigUpdatePercent and scrapeConfigUpdateInterval can be used to generate a non-zero time series churn rate, which is a very typical scenario in Kubernetes monitoring.
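Before launching anything, it can help to sanity-check the load these parameters will generate. The following back-of-the-envelope helper is not part of the project and assumes the approximate figure of ~815 series per node_exporter target quoted above; it computes the expected active series, ingestion rate and churn:

package main

import (
	"fmt"
	"time"
)

// seriesPerTarget is an approximation; the exact number depends on the node.
const seriesPerTarget = 815

func main() {
	// The benchmark parameters (defaults shown here).
	targetsCount := 1000
	scrapeInterval := 10 * time.Second
	scrapeConfigUpdatePercent := 1.0
	scrapeConfigUpdateInterval := 10 * time.Minute

	activeSeries := targetsCount * seriesPerTarget
	samplesPerSec := float64(activeSeries) / scrapeInterval.Seconds()
	churnedTargets := float64(targetsCount) * scrapeConfigUpdatePercent / 100
	newSeriesPerUpdate := churnedTargets * seriesPerTarget

	fmt.Printf("active series:      %d\n", activeSeries)
	fmt.Printf("ingestion rate:     %.0f samples/s\n", samplesPerSec)
	fmt.Printf("new series every %v: %.0f\n", scrapeConfigUpdateInterval, newSeriesPerUpdate)
}

With the defaults (targetsCount=1000, scrapeInterval=10s) this prints roughly 815K active series and about 81.5K samples/s.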
How to use
A typical scenario is to run several VictoriaMetrics setups and list their addresses in the remoteStorages section. The default benchmark configuration, targetsCount=1000 and scrapeInterval=10s, produces roughly 80k samples/s:
800 metrics-per-target * 1k targets / 10s = 80k samples/s
We can then compare resource usage, data compression and overall performance with the official VictoriaMetrics Grafana dashboards.
The project is installed via a Helm chart, which by default also deploys a single-node VictoriaMetrics instance. Clone the project:
$ git clone https://github.com/VictoriaMetrics/prometheus-benchmark
$ cd prometheus-benchmark
Then adjust the parameters in chart/values.yaml to your needs. My configuration looks like this:
vmtag: "v1.77.1"
# targetsCount defines the number of nodeexporter instances to scrape.
# This option allows to configure the number of active time series to push
# to remoteStorages.
# Every nodeexporter exposes around 815 unique metrics, so when targetsCount
# is set to 1000, then the benchmark generates around 815*1000=815K active time series.
targetsCount: 2000
# scrapeInterval defines how frequently to scrape nodeexporter targets.
# This option allows to configure data ingestion rate per every remoteStorages.
# For example, if the benchmark generates 815K active time series and scrapeInterval
# is set to 10s, then the data ingestion rate equals to 815K/10s = 81.5K samples/sec.
scrapeInterval: 10s
# queryInterval is how often to send queries from files/alerts.yaml to remoteStorages.readURL
# This option can be used for tuning read load at remoteStorages.
# It is a good rule of thumb to keep it in sync with scrapeInterval.
queryInterval: 10s
# scrapeConfigUpdatePercent is the percent of nodeexporter targets
# which are updated with unique label on every scrape config update
# (see scrapeConfigUpdateInterval).
# This option allows tuning time series churn rate.
# For example, if scrapeConfigUpdatePercent is set to 1 for targetsCount=1000,
# then around 10 targets get updated labels on every scrape config update.
# This generates around 815*10=8150 new time series every scrapeConfigUpdateInterval.
scrapeConfigUpdatePercent: 1
# scrapeConfigUpdateInterval specifies how frequently to update labels
# across scrapeConfigUpdatePercent nodeexporter targets.
# This option allows tuning time series churn rate.
# For example, if scrapeConfigUpdateInterval is set to 10m for targetsCount=1000
# and scrapeConfigUpdatePercent=1, then around 10 targets get updated labels every 10 minutes.
# This generates around 815*10=8150 new time series every 10 minutes.
scrapeConfigUpdateInterval: 10m
# writeConcurrency is the number of concurrent tcp connections to use
# for sending the data to the tested remoteStorages.
# Increase this value if there is a high network latency between prometheus-benchmark
# components and the tested remoteStorages.
writeConcurrency: 16
# remoteStorages contains a named list of Prometheus-compatible systems to test.
# These systems must support data ingestion via Prometheus remote_write protocol.
# These systems must also support Prometheus querying API if query performance
# needs to be measured additionally to data ingestion performance.
remoteStorages:
  vm-0:
    # writeURL is the remote storage endpoint that accepts data via the
    # Prometheus remote_write protocol. For example:
    # - single-node VictoriaMetrics: http://<victoriametrics-addr>:8428/api/v1/write
    # - cluster VictoriaMetrics: http://<vminsert-addr>:8480/insert/0/prometheus/api/v1/write
    writeURL: "http://my-bench-prometheus-benchmark-vmsingle.default.svc.cluster.local:8428/api/v1/write"

    # readURL is optional and only needed if query performance should be tested as well;
    # the alerting rules from files/alerts.yaml are then evaluated against readURL.
    # For example:
    # - single-node VictoriaMetrics: http://<victoriametrics-addr>:8428/
    # - cluster VictoriaMetrics: http://<vmselect-addr>:8481/select/0/prometheus/
    readURL: "http://my-bench-prometheus-benchmark-vmsingle.default.svc.cluster.local:8428/"
    writeBearerToken: ""
    readBearerToken: ""
  # vm-1:  # additional remote storage systems can be listed here
  #   writeURL: "http://victoria-metrics-victoria-metrics-cluster-vminsert.default.svc.cluster.local:8480/insert/1/prometheus/api/v1/write"
  #   readURL: "http://victoria-metrics-victoria-metrics-cluster-vmselect.default.svc.cluster.local:8481/select/1/prometheus/"
The project ships with a single-node VictoriaMetrics by default, but the chart templates do not include a Service object for it, which is inconvenient. Let's add a chart/templates/vmsingle/service.yaml file with the following content:
apiVersion: v1
kind: Service
metadata:
  name: {{ include "prometheus-benchmark.fullname" . }}-vmsingle
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "prometheus-benchmark.labels" . | nindent 4 }}
spec:
  type: ClusterIP
  selector:
    job: vmsingle
    {{- include "prometheus-benchmark.selectorLabels" . | nindent 4 }}
  ports:
  - port: 8428
    targetPort: 8428
Once configured, run the following command from the project root to start the benchmark:
$ make install
The command above installs everything via Helm:
$ kubectl get pods
NAME                                                           READY   STATUS    RESTARTS   AGE
grafana-db468ccf9-mtn87                                        1/1     Running   0          90m
my-bench-prometheus-benchmark-nodeexporter-76c497cf59-m5k66    2/2     Running   0          49m
my-bench-prometheus-benchmark-vmagent-vm-0-6bcbbb5fd8-8rhcx    2/2     Running   0          49m
my-bench-prometheus-benchmark-vmalert-vm-0-6f6b565ccc-snsk5    2/2     Running   0          49m
my-bench-prometheus-benchmark-vmsingle-585579fbf5-cpzhg        1/1     Running   0          49m
$ kubectl get svc
NAME                                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
my-bench-prometheus-benchmark-nodeexporter   ClusterIP   10.96.156.144   <none>        9102/TCP   50m
my-bench-prometheus-benchmark-vmsingle       ClusterIP   10.96.75.242    <none>        8428/TCP   50m
The node-exporter pod contains two containers: the application itself, and an nginx wrapper that relieves the pressure on node-exporter. We can then add http://my-bench-prometheus-benchmark-vmsingle:8428 as a data source in Grafana and import a few of the official VictoriaMetrics Grafana dashboards: https://grafana.com/orgs/victoriametrics.
The chart below shows the vmagent dashboard. It reports 2000 scrape targets, consistent with the configuration above, and no errors at all, which indicates that vmagent is still well within its limits.
Likewise, we can look at the dashboard of the single-node VictoriaMetrics used as remote storage. It is also running fine, with the number of active time series exceeding one million.
Besides looking at the dashboards above, we can verify whether the stress limit has been reached with the following metrics. The data ingestion rate:
sum(rate(vm_promscrape_scraped_samples_sum{job="vmagent"})) by (remote_storage_name)
The number of packets dropped while sending data to the configured remote storage. If this value is greater than zero, the remote storage refuses to accept the incoming data; in that case, check the remote storage logs and the vmagent logs:
sum(rate(vmagent_remotewrite_packets_dropped_total{job="vmagent"})) by (remote_storage_name)
The number of retries when sending data to the remote storage. If this value is greater than zero, the remote storage cannot handle the workload; in that case, check the remote storage logs and the vmagent logs:
sum(rate(vmagent_remotewrite_retries_count_total{job="vmagent"})) by (remote_storage_name)
The amount of pending data on the vmagent side that has not yet been sent to the remote storage. If this graph grows, the remote storage cannot keep up with the given data ingestion rate. You can try increasing writeConcurrency in chart/values.yaml, which may help if there is high network latency between the prometheus-benchmark vmagent instances and the tested remote storage:
sum(vm_persistentqueue_bytes_pending{job="vmagent"}) by (remote_storage_name)
The number of errors when executing the queries from chart/files/alerts.yaml. If this value is greater than zero, the remote storage cannot handle the query workload; in that case, check the remote storage logs and the vmalert logs:
sum(rate(vmalert_execution_errors_total{job="vmalert"})) by (remote_storage_name)
These metrics can be queried by running the make monitor command:
$ make monitor
kubectl port-forward `kubectl -n default get pods -n default -l 'job=vmsingle,chart-name=my-bench-prometheus-benchmark' -o name` 8428
Forwarding from 127.0.0.1:8428 -> 8428
Forwarding from [::1]:8428 -> 8428
Then we can open http://127.0.0.1:8428/vmui in the browser and verify the metrics above, as shown below:
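If you prefer scripting this check instead of clicking through vmui, the following minimal sketch (an illustration, not part of the benchmark) sends one of the expressions above to the port-forwarded endpoint via the standard Prometheus /api/v1/query API, which VictoriaMetrics supports:

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/url"
)

func main() {
	// One of the verification queries from this section; ideally it returns zero.
	expr := `sum(rate(vmagent_remotewrite_retries_count_total{job="vmagent"})) by (remote_storage_name)`
	u := "http://127.0.0.1:8428/api/v1/query?query=" + url.QueryEscape(expr)

	resp, err := http.Get(u)
	if err != nil {
		log.Fatalf("query failed: %s", err)
	}
	defer resp.Body.Close()

	// Decode just the parts of the Prometheus query response we need.
	var r struct {
		Status string `json:"status"`
		Data   struct {
			Result []struct {
				Metric map[string]string `json:"metric"`
				Value  []interface{}     `json:"value"` // [timestamp, "value"]
			} `json:"result"`
		} `json:"data"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&r); err != nil {
		log.Fatalf("cannot decode response: %s", err)
	}
	for _, res := range r.Data.Result {
		fmt.Printf("%v => %v\n", res.Metric, res.Value[1])
	}
}

The same pattern works for the other expressions listed above; a non-zero retry or drop rate points at the remote storage struggling with the load.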
Our test shows that 2000 scrape targets is not the limit yet, so we can keep increasing the value, for example to 2500; if all components still run fine, increase it to 3000 and test again:
$ make install
For example, with 4000 scrape targets, i.e. 800 metrics-per-target * 4k targets / 10s = 320k samples/s, everything still runs fine, so the official claim that a single-node VictoriaMetrics can handle one million samples per second seems fairly credible.

Run the following command to tear down the benchmark:
make delete
Summary
After stress testing and comparing different solutions that accept data via the Prometheus remote_write protocol, or different versions of the same solution, we should be able to draw a rough conclusion about how solutions such as Prometheus itself, Cortex, Thanos, M3DB and TimescaleDB perform. However, we always recommend not to blindly trust such benchmarks, but to verify the results against production-like data volumes and actual resource usage.
