Stress-testing Prometheus-compatible time series databases

2022-05-24 14:50

To compare different VictoriaMetrics releases, or to compare VictoriaMetrics with other solutions that support the Prometheus remote_write protocol, VictoriaMetrics provides a dedicated project called prometheus-benchmark.

How it works

The implementation of the project is actually quite simple:

• node_exporter is used as the source of production-like metrics
• An nginx instance sits in front of node_exporter as a caching proxy, which reduces the load on node_exporter when a large number of metrics is scraped
• vmagent scrapes the node_exporter metrics and forwards them via the Prometheus remote_write protocol to the configured targets; if several targets are configured, a separate vmagent instance pushes the scraped data to each target independently

Note, however, that the benchmark does not collect metrics from the configured remote_write targets; it only scrapes the metrics of its internal components, vmagent and vmalert. It assumes that the Prometheus storage system under test is monitored separately. For example, below we use single-node VictoriaMetrics as the remote_write target, and we can monitor it ourselves.

The core of the project is a small service that keeps regenerating the scrape configuration according to the parameters passed in; vmagent then fetches its configuration (-promscrape.config) from the HTTP endpoint this service exposes and scrapes the targets accordingly. The core code is shown below:

package main

import (
	"flag"
	"fmt"
	"log"
	"math/rand"
	"net/http"
	"sync"
	"time"

	"gopkg.in/yaml.v2"
)

var (
	listenAddr                 = flag.String("httpListenAddr", ":8436", "TCP address for incoming HTTP requests")
	targetsCount               = flag.Int("targetsCount", 100, "The number of scrape targets to return from -httpListenAddr. Each target has the same address defined by -targetAddr")
	targetAddr                 = flag.String("targetAddr", "demo.robustperception.io:9090", "Address with port to use as target address the scrape config returned from -httpListenAddr")
	scrapeInterval             = flag.Duration("scrapeInterval", time.Second*5, "The scrape_interval to set at the scrape config returned from -httpListenAddr")
	scrapeConfigUpdateInterval = flag.Duration("scrapeConfigUpdateInterval", time.Minute*10, "The -scrapeConfigUpdatePercent scrape targets are updated in the scrape config returned from -httpListenAddr every -scrapeConfigUpdateInterval")
	scrapeConfigUpdatePercent  = flag.Float64("scrapeConfigUpdatePercent", 1, "The -scrapeConfigUpdatePercent scrape targets are updated in the scrape config returned from -httpListenAddr every -scrapeConfigUpdateInterval")
)

func main() {
	flag.Parse()
	flag.VisitAll(func(f *flag.Flag) {
		log.Printf("-%s=%s", f.Name, f.Value)
	})
	c := newConfig(*targetsCount, *scrapeInterval, *targetAddr)
	var cLock sync.Mutex
	p := *scrapeConfigUpdatePercent / 100
	r := rand.New(rand.NewSource(time.Now().UnixNano()))
	go func() {
		rev := 0
		for range time.Tick(*scrapeConfigUpdateInterval) {
			rev++
			revStr := fmt.Sprintf("r%d", rev)
			cLock.Lock()
			for _, sc := range c.ScrapeConfigs {
				for _, stc := range sc.StaticConfigs {
					if r.Float64() >= p {
						continue
					}
					stc.Labels["revision"] = revStr
				}
			}
			cLock.Unlock()
		}
	}()
	rh := func(w http.ResponseWriter, r *http.Request) {
		cLock.Lock()
		data := c.marshalYAML()
		cLock.Unlock()
		w.Header().Set("Content-Type", "text/yaml")
		w.Write(data)
	}
	hf := http.HandlerFunc(rh)
	log.Printf("starting scrape config updater at http://%s/", *listenAddr)
	if err := http.ListenAndServe(*listenAddr, hf); err != nil {
		log.Fatalf("unexpected error when running the http server: %s", err)
	}
}

func (c *config) marshalYAML() []byte {
	data, err := yaml.Marshal(c)
	if err != nil {
		log.Fatalf("BUG: unexpected error when marshaling config: %s", err)
	}
	return data
}

func newConfig(targetsCount int, scrapeInterval time.Duration, targetAddr string) *config {
	scs := make([]*staticConfig, 0, targetsCount)
	for i := 0; i < targetsCount; i++ {
		scs = append(scs, &staticConfig{
			Targets: []string{targetAddr},
			Labels: map[string]string{
				"instance": fmt.Sprintf("host-%d", i),
				"revision": "r0",
			},
		})
	}
	return &config{
		Global: globalConfig{
			ScrapeInterval: scrapeInterval,
		},
		ScrapeConfigs: []*scrapeConfig{
			{
				JobName:       "node_exporter",
				StaticConfigs: scs,
			},
		},
	}
}

// config represents essential parts from Prometheus config defined at https://prometheus.io/docs/prometheus/latest/configuration/configuration/
type config struct {
	Global        globalConfig    `yaml:"global"`
	ScrapeConfigs []*scrapeConfig `yaml:"scrape_configs"`
}

// globalConfig represents essential parts for `global` section of Prometheus config.
//
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/
type globalConfig struct {
	ScrapeInterval time.Duration `yaml:"scrape_interval"`
}

// scrapeConfig represents essential parts for `scrape_config` section of Prometheus config.
//
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
type scrapeConfig struct {
	JobName       string          `yaml:"job_name"`
	StaticConfigs []*staticConfig `yaml:"static_configs"`
}

// staticConfig represents essential parts for `static_config` section of Prometheus config.
//
// See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#static_config
type staticConfig struct {
	Targets []string          `yaml:"targets"`
	Labels  map[string]string `yaml:"labels"`
}
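
To make the interface concrete, here is a minimal sketch (not part of the project) of what a consumer of this endpoint does: it fetches the generated YAML from the default -httpListenAddr (assumed here to be reachable at localhost:8436) and parses it. This is essentially what vmagent does when its -promscrape.config points at this URL.

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"

	"gopkg.in/yaml.v2"
)

func main() {
	// Fetch the scrape config generated by the benchmark service above.
	resp, err := http.Get("http://localhost:8436/")
	if err != nil {
		log.Fatalf("cannot fetch scrape config: %s", err)
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatalf("cannot read response body: %s", err)
	}

	// Parse only the parts we want to inspect; the field names mirror the
	// config structs defined in the snippet above.
	var c struct {
		ScrapeConfigs []struct {
			JobName       string `yaml:"job_name"`
			StaticConfigs []struct {
				Targets []string          `yaml:"targets"`
				Labels  map[string]string `yaml:"labels"`
			} `yaml:"static_configs"`
		} `yaml:"scrape_configs"`
	}
	if err := yaml.Unmarshal(data, &c); err != nil {
		log.Fatalf("cannot parse scrape config: %s", err)
	}
	for _, sc := range c.ScrapeConfigs {
		fmt.Printf("job %q has %d targets\n", sc.JobName, len(sc.StaticConfigs))
	}
}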

The benchmark can be controlled with the following parameters:

targetsCount

targetsCount defines how many node_exporter scrape targets are added to the vmagent scrape config; each one carries a unique instance label. This parameter affects both the number and the cardinality of the scraped metrics. A single node_exporter typically exposes around 815 unique metrics, so with targetsCount set to 1000 the benchmark generates roughly 815*1000 = 815K active time series.

scrapeInterval

scrapeInterval defines how often each target is scraped. This parameter affects the data ingestion rate: the smaller the interval, the higher the ingestion rate.

remoteStorages

remoteStorages contains the list of systems under test to which the scraped metrics are pushed. If several targets are configured, a separate vmagent instance pushes the same data to each of them.

Churn rate

scrapeConfigUpdatePercent and scrapeConfigUpdateInterval can be used to generate a non-zero time series churn rate, which is a very typical scenario in Kubernetes monitoring.
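
To tie these parameters together, here is a small Go sketch (purely illustrative, not part of the project) of the back-of-the-envelope arithmetic used throughout this article, assuming the ~815 unique metrics per node_exporter target quoted by the project:

package main

import (
	"fmt"
	"time"
)

func main() {
	const metricsPerTarget = 815 // approximate unique series exposed by one node_exporter

	// The knobs described above, set to the benchmark defaults.
	targetsCount := 1000
	scrapeInterval := 10 * time.Second
	scrapeConfigUpdatePercent := 1.0
	scrapeConfigUpdateInterval := 10 * time.Minute

	activeSeries := targetsCount * metricsPerTarget
	ingestionRate := float64(activeSeries) / scrapeInterval.Seconds()
	churnedTargets := float64(targetsCount) * scrapeConfigUpdatePercent / 100
	newSeriesPerUpdate := churnedTargets * metricsPerTarget

	fmt.Printf("active time series: %d\n", activeSeries)              // 815000
	fmt.Printf("ingestion rate:     %.0f samples/s\n", ingestionRate) // 81500
	fmt.Printf("new series per %v:  %.0f\n", scrapeConfigUpdateInterval, newSeriesPerUpdate) // 8150 every 10m0s
}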

How to use it

A typical scenario is to run several VictoriaMetrics setups and list their addresses in the remoteStorages section. The default benchmark configuration of targetsCount=1000 and scrapeInterval=10s results in roughly 80k samples/s:

800 metrics-per-target * 1k targets / 10s = 80k samples/s

We can then compare resource usage, data compression and overall performance with the official VictoriaMetrics Grafana dashboards.

The project is installed via a Helm chart and deploys a single-node VictoriaMetrics instance by default. Clone the project first:

$ git clone https://github.com/VictoriaMetrics/prometheus-benchmark
$ cd prometheus-benchmark

Then adjust the parameters in chart/values.yaml to your needs; my configuration looks like this:

vmtag: "v1.77.1"

# targetsCount defines the number of nodeexporter instances to scrape.
# This option allows to configure the number of active time series to push
# to remoteStorages.
# Every nodeexporter exposes around 815 unique metrics, so when targetsCount
# is set to 1000, then the benchmark generates around 815*1000=815K active time series.
targetsCount: 2000

# scrapeInterval defines how frequently to scrape nodeexporter targets.
# This option allows to configure data ingestion rate per every remoteStorages.
# For example, if the benchmark generates 815K active time series and scrapeInterval
# is set to 10s, then the data ingestion rate equals to 815K/10s = 81.5K samples/sec.
scrapeInterval: 10s

# queryInterval is how often to send queries from files/alerts.yaml to remoteStorages.readURL
# This option can be used for tuning read load at remoteStorages.
# It is a good rule of thumb to keep it in sync with scrapeInterval.
queryInterval: 10s

# scrapeConfigUpdatePercent is the percent of nodeexporter targets
# which are updated with unique label on every scrape config update
# (see scrapeConfigUpdateInterval).
# This option allows tuning time series churn rate.
# For example, if scrapeConfigUpdatePercent is set to 1 for targetsCount=1000,
# then around 10 targets gets updated labels on every scrape config update.
# This generates around 815*10=8150 new time series every scrapeConfigUpdateInterval.
scrapeConfigUpdatePercent: 1

# scrapeConfigUpdateInterval specifies how frequently to update labels
# across scrapeConfigUpdatePercent nodeexporter targets.
# This option allows tuning time series churn rate.
# For example, if scrapeConfigUpdateInterval is set to 10m for targetsCount=1000
# and scrapeConfigUpdatePercent=1, then around 10 targets gets updated labels every 10 minutes.
# This generates around 815*10=8150 new time series every 10 minutes.
scrapeConfigUpdateInterval: 10m

# writeConcurrency is the number of concurrent tcp connections to use
# for sending the data to the tested remoteStorages.
# Increase this value if there is a high network latency between prometheus-benchmark
# components and the tested remoteStorages.
writeConcurrency: 16

# remoteStorages contains a named list of Prometheus-compatible systems to test.
# These systems must support data ingestion via Prometheus remote_write protocol.
# These systems must also support Prometheus querying API if query performance
# needs to be measured additionally to data ingestion performance.
remoteStorages:
  vm-0:
    # writeURL is the remote storage endpoint that accepts data via the
    # Prometheus remote_write protocol. For example:
    # - single-node VictoriaMetrics: http://<victoriametrics-addr>:8428/api/v1/write
    # - cluster VictoriaMetrics: http://<vminsert-addr>:8480/insert/0/prometheus/api/v1/write
    writeURL: "http://my-bench-prometheus-benchmark-vmsingle.default.svc.cluster.local:8428/api/v1/write"

    # readURL is optional and only needed if query performance should be tested as well;
    # the alerting rules from files/alerts.yaml are then executed against readURL.
    # For example:
    # - single-node VictoriaMetrics: http://<victoriametrics-addr>:8428/
    # - cluster VictoriaMetrics: http://<vmselect-addr>:8481/select/0/prometheus/
    readURL: "http://my-bench-prometheus-benchmark-vmsingle.default.svc.cluster.local:8428/"
    writeBearerToken: ""
    readBearerToken: ""
  # vm-1:  # additional remote storage systems can be listed here
  #   writeURL: "http://victoria-metrics-victoria-metrics-cluster-vminsert.default.svc.cluster.local:8480/insert/1/prometheus/api/v1/write"
  #   readURL: "http://victoria-metrics-victoria-metrics-cluster-vmselect.default.svc.cluster.local:8481/select/1/prometheus/"

The project ships with a single-node VictoriaMetrics by default, but the chart templates do not include a Service object for it, which is inconvenient. Let's add a chart/templates/vmsingle/service.yaml file with the following content:

apiVersion: v1
kind: Service
metadata:
  name: {{ include "prometheus-benchmark.fullname" . }}-vmsingle
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "prometheus-benchmark.labels" . | nindent 4 }}
spec:
  type: ClusterIP
  selector:
    job: vmsingle
    {{- include "prometheus-benchmark.selectorLabels" . | nindent 4 }}
  ports:
  - port: 8428
    targetPort: 8428

Once configured, run the following command from the project root to start the benchmark:

$ make install

The command above installs everything with Helm:

$ kubectl get pods
NAME                                                           READY   STATUS    RESTARTS   AGE
grafana-db468ccf9-mtn87                                        1/1     Running   0          90m
my-bench-prometheus-benchmark-nodeexporter-76c497cf59-m5k66    2/2     Running   0          49m
my-bench-prometheus-benchmark-vmagent-vm-0-6bcbbb5fd8-8rhcx    2/2     Running   0          49m
my-bench-prometheus-benchmark-vmalert-vm-0-6f6b565ccc-snsk5    2/2     Running   0          49m
my-bench-prometheus-benchmark-vmsingle-585579fbf5-cpzhg        1/1     Running   0          49m
$ kubectl get svc
NAME                                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
my-bench-prometheus-benchmark-nodeexporter   ClusterIP   10.96.156.144   <none>        9102/TCP   50m
my-bench-prometheus-benchmark-vmsingle       ClusterIP   10.96.75.242    <none>        8428/TCP   50m

The node-exporter pod contains two containers: the application itself and an nginx wrapper that relieves pressure on node-exporter. We can then add http://my-bench-prometheus-benchmark-vmsingle:8428 as a data source in Grafana and import some of the official VictoriaMetrics Grafana dashboards: https://grafana.com/orgs/victoriametrics

The vmagent dashboard shows that 2000 targets are currently being scraped, which matches the configuration above, and no errors have appeared, so vmagent is still well within its capacity.

Likewise, the dashboard for the single-node VictoriaMetrics remote storage shows that it is running normally, with the number of active time series exceeding one million.
Besides these dashboards, we can also use the following metrics to verify whether the setup has reached its limits:
• Data ingestion rate:
sum(rate(vm_promscrape_scraped_samples_sum{job="vmagent"})) by (remote_storage_name)
• The number of dropped data packets when sending data to the configured remote storage. If this value is greater than zero, the remote storage refuses to accept the incoming data. In this case it is recommended to check the remote storage logs and the vmagent logs.
sum(rate(vmagent_remotewrite_packets_dropped_total{job="vmagent"})) by (remote_storage_name)
• The number of retries when sending data to the remote storage. If this value is greater than zero, the remote storage cannot handle the workload. In this case it is recommended to check the remote storage logs and the vmagent logs.
sum(rate(vmagent_remotewrite_retries_count_total{job="vmagent"})) by (remote_storage_name)
• The amount of pending data on the vmagent side that has not yet been sent to the remote storage. If this graph grows, the remote storage cannot keep up with the given ingestion rate. You can try increasing writeConcurrency in chart/values.yaml, which may help if there is high network latency between the prometheus-benchmark vmagent and the tested remote storage.
sum(vm_persistentqueue_bytes_pending{job="vmagent"}) by (remote_storage_name)
• The number of errors when executing the queries from chart/files/alerts.yaml. If this value is greater than zero, the remote storage cannot handle the query workload. In this case it is recommended to check the remote storage logs and the vmalert logs.
sum(rate(vmalert_execution_errors_total{job="vmalert"})) by (remote_storage_name)

We can query these metrics by running the make monitor command:

$ make monitor
kubectl port-forward `kubectl -n default get pods -n default -l 'job=vmsingle,chart-name=my-bench-prometheus-benchmark' -o name` 8428
Forwarding from 127.0.0.1:8428 -> 8428
Forwarding from [::1]:8428 -> 8428

Then open http://127.0.0.1:8428/vmui in a browser to check the metrics listed above.
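
For scripted checks, the same expressions can also be sent to the standard Prometheus querying API that VictoriaMetrics exposes. A minimal Go sketch, assuming the port-forward opened by make monitor is still running on 127.0.0.1:8428:

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
)

func main() {
	// One of the expressions from the list above: remote_write retries per second.
	q := `sum(rate(vmagent_remotewrite_retries_count_total{job="vmagent"})) by (remote_storage_name)`
	u := "http://127.0.0.1:8428/api/v1/query?query=" + url.QueryEscape(q)

	resp, err := http.Get(u)
	if err != nil {
		log.Fatalf("query failed: %s", err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatalf("cannot read response: %s", err)
	}

	// The response is standard Prometheus JSON; any value above zero here means
	// the tested remote storage is struggling and its logs should be checked.
	fmt.Println(string(body))
}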

Our test shows that 2000 scrape targets do not hit the limit, so we can keep increasing the value, for example to 2500, and if all components still run fine, raise it to 3000 and test again:

$ make install

For instance, with 4000 scrape targets, i.e. 800 metrics-per-target * 4k targets / 10s = 320k samples/s, everything still runs fine, so the official claim that single-node VictoriaMetrics can handle one million samples per second looks fairly credible.

Run the following command to finish the test:

make delete

Summary

By stress-testing different solutions that accept data via the Prometheus remote_write protocol, or different versions of the same solution, we can reach a rough conclusion about how, for example, Prometheus itself, Cortex, Thanos, M3DB and TimescaleDB perform. However, we always recommend not blindly trusting such benchmarks, and instead validating the data volumes and resource usage against your own production workloads.
