
A Deep Dive into the Log Management Tool Loki: From Basics to Hands-On Practice

2024-07-23 16:41

Contents

• 1 Loki

  • 1.1 Introduction

  • 1.2 How Loki Works

    • 1.2.1 Log Parsing Format

    • 1.2.2 Log Collection Architecture

    • 1.2.3 Loki Deployment Modes

  • 1.3 Server-Side Deployment

    • 1.3.1 AllInOne Deployment Mode

      • 1.3.1.1 k8s Deployment

      • 1.3.1.2 Creating the ConfigMap

      • 1.3.1.3 Creating Persistent Storage

      • 1.3.1.4 Creating the Application

      • 1.3.1.5 Verifying the Deployment

    • 1.3.2 Bare-Metal Deployment

  • 1.4 Promtail Deployment

    • 1.4.1 k8s Deployment

      • 1.4.1.1 Creating the Configuration File

      • 1.4.1.2 Creating the DaemonSet File

      • 1.4.1.3 Creating the promtail Application

    • 1.4.2 Bare-Metal Deployment

  • 1.5 Data Source

  • 1.6 Other Client Configurations

    • 1.6.1 Logstash as a Log Collection Client

  • 1.7 Helm Installation

  • 1.8 Troubleshooting

    • 1.8.1 502 BadGateWay

    • 1.8.2 Ingester not ready: instance xx:9095 in state JOINING

    • 1.8.3 too many unhealthy instances in the ring

    • 1.8.4 Data source connected


          1 Loki

1.1 Introduction

Loki is a lightweight log collection and analysis application. It uses promtail to gather log content and push it to Loki for storage; the logs are then displayed and queried by adding Loki as a data source in Grafana.

Official documentation: https://grafana.com/docs/loki/latest/

Loki's persistent storage supports five backend types: azure, gcs, s3, swift, and local, of which s3 and local are the most common. It also supports many log collection clients; the most popular ones, such as logstash and fluent-bit, are on the officially supported list.

Advantages:

• Supported clients include Promtail, Fluent Bit, Fluentd, Vector, Logstash, and Grafana Agent

• Promtail, the preferred agent, can extract logs from many sources, including local log files, systemd, Windows event logs, Docker logging drivers, and more

• No log format requirements: JSON, XML, CSV, logfmt, and unstructured text all work

• Logs are queried with the same syntax used to query metrics

• Log lines can be filtered and transformed dynamically at query time

• The metrics you need can easily be computed from logs

• Minimal indexing at ingest means you can slice and dice logs dynamically at query time, answering new questions as they arise

• Cloud-native support, scraping data Prometheus-style

A quick comparison of log collection stacks:

| Stack | Components installed | Advantages |
|---|---|---|
| ELK/EFK | elasticsearch, logstash, kibana, filebeat, kafka/redis | Supports custom grok regex parsing of complex log content; dashboards offer rich visualizations |
| Loki | grafana, loki, promtail | Low resource usage; native Grafana support; fast queries |

1.2 How Loki Works

1.2.1 Log Parsing Format

As the figure above shows, Loki parses logs around an index: the index consists of the timestamp plus a subset of the pod's labels (other labels such as filename, container, and so on), and everything else is log content. A concrete query looks like this:

`{app="loki",namespace="kube-public"}` is the index.

1.2.2 Log Collection Architecture

In practice, the official recommendation is to run promtail as the agent, deployed as a DaemonSet on the kubernetes worker nodes, to collect logs. The other log collection tools mentioned above also work; their configuration is covered at the end of this article.

1.2.3 Loki Deployment Modes

Loki is built from a number of microservice components, five in all. A cache can be added on top of them to hold data and speed up queries. Data lives in shared storage, and configuring the memberlist_config section shares state between instances, letting Loki scale out horizontally without limit.

Once memberlist_config is in place, instances are polled round-robin to locate data. For convenience, the official build compiles all the microservices into a single binary, controlled by the -target command-line flag, which accepts all, read, and write; pick the mode at deployment time based on your log volume.

• all (read-write mode)
  After the service starts, data queries and data writes are all served by this single node.

• read/write (read-write separation mode)
  In this mode, the query-frontend forwards query traffic to the read nodes. Read nodes keep the querier, ruler, and frontend; write nodes keep the distributor and ingester.

• microservices mode
  Each process is started in a different role via different configuration parameters, and each process serves only its target role.

| Component | Function |
|---|---|
| Distributor | Validates incoming data; sorts data; consistent hashing; QPS limiting; forwarding; data replication so nothing is lost |
| Ingester | Orders entries by timestamp; filesystem support; write-ahead log (WAL) |
| Query frontend (query-frontend) | Serves query requests and dispatches them to backend storage; query-queueing prevents OOM on large queries; query-split breaks up large queries and aggregates the results |
| Querier | Queries logs in backend storage using LogQL |
| Cache | Caches queried logs for later use; if the cached data is incomplete, re-queries the missing parts |

1.3 Server-Side Deployment

Before deploying, you need a working k8s cluster.

| Application | Image |
|---|---|
| loki | grafana/loki:2.5.0 |
| promtail | grafana/promtail:2.5.0 |

1.3.1 AllInOne Deployment Mode

1.3.1.1 k8s Deployment

The program downloaded from GitHub ships without a configuration file, so one has to be prepared in advance. A complete allInOne configuration file is provided here, with parts of it tuned.

The configuration file contents are as follows:

```yaml
auth_enabled: false
target: all
ballast_bytes: 20480
server:
  grpc_listen_port: 9095
  http_listen_port: 3100
  graceful_shutdown_timeout: 20s
  grpc_listen_address: "0.0.0.0"
  grpc_listen_network: "tcp"
  grpc_server_max_concurrent_streams: 100
  grpc_server_max_recv_msg_size: 4194304
  grpc_server_max_send_msg_size: 4194304
  http_server_idle_timeout: 2m
  http_listen_address: "0.0.0.0"
  http_listen_network: "tcp"
  http_server_read_timeout: 30s
  http_server_write_timeout: 20s
  log_source_ips_enabled: true
  # if http_path_prefix is changed, the prefix must be included when pushing logs
  # http_path_prefix: "/"
  register_instrumentation: true
  log_format: json
  log_level: info
distributor:
  ring:
    heartbeat_timeout: 3s
    kvstore:
      prefix: collectors/
      store: memberlist
      # requires a consul cluster created in advance
      # consul:
      #   http_client_timeout: 20s
      #   consistent_reads: true
      #   host: 127.0.0.1:8500
      #   watch_burst_size: 2
      #   watch_rate_limit: 2
querier:
  engine:
    max_look_back_period: 20s
    timeout: 3m0s
  extra_query_delay: 100ms
  max_concurrent: 10
  multi_tenant_queries_enabled: true
  query_ingester_only: false
  query_ingesters_within: 3h0m0s
  query_store_only: false
  query_timeout: 5m0s
  tail_max_duration: 1h0s
query_scheduler:
  max_outstanding_requests_per_tenant: 2048
  grpc_client_config:
    max_recv_msg_size: 104857600
    max_send_msg_size: 16777216
    grpc_compression: gzip
    rate_limit: 0
    rate_limit_burst: 0
    backoff_on_ratelimits: false
    backoff_config:
      min_period: 50ms
      max_period: 15s
      max_retries: 5
  use_scheduler_ring: true
  scheduler_ring:
    kvstore:
      store: memberlist
      prefix: "collectors/"
    heartbeat_period: 30s
    heartbeat_timeout: 1m0s
    # defaults to the name of the first network interface
    # instance_interface_names:
    # instance_addr: 127.0.0.1
    # defaults to server.grpc-listen-port
    instance_port: 9095
frontend:
  max_outstanding_per_tenant: 4096
  querier_forget_delay: 1h0s
  compress_responses: true
  log_queries_longer_than: 2m0s
  max_body_size: 104857600
  query_stats_enabled: true
  scheduler_dns_lookup_period: 10s
  scheduler_worker_concurrency: 15
query_range:
  align_queries_with_step: true
  cache_results: true
  parallelise_shardable_queries: true
  max_retries: 3
  results_cache:
    cache:
      enable_fifocache: false
      default_validity: 30s
      background:
        writeback_buffer: 10000
      redis:
        endpoint: 127.0.0.1:6379
        timeout: 1s
        expiration: 0s
        db: 9
        pool_size: 128
        password: 1521Qyx6^
        tls_enabled: false
        tls_insecure_skip_verify: true
        idle_timeout: 10s
        max_connection_age: 8h
ruler:
  enable_api: true
  enable_sharding: true
  alertmanager_refresh_interval: 1m
  disable_rule_group_label: false
  evaluation_interval: 1m0s
  flush_period: 3m0s
  for_grace_period: 20m0s
  for_outage_tolerance: 1h0s
  notification_queue_capacity: 10000
  notification_timeout: 4s
  poll_interval: 10m0s
  query_stats_enabled: true
  remote_write:
    config_refresh_period: 10s
    enabled: false
  resend_delay: 2m0s
  rule_path: /rulers
  search_pending_for: 5m0s
  storage:
    local:
      directory: /data/loki/rulers
    type: configdb
  sharding_strategy: default
  wal_cleaner:
    period: 240h
    min_age: 12h0m0s
  wal:
    dir: /data/loki/ruler_wal
    max_age: 4h0m0s
    min_age: 5m0s
    truncate_frequency: 1h0m0s
  ring:
    kvstore:
      store: memberlist
      prefix: "collectors/"
    heartbeat_period: 5s
    heartbeat_timeout: 1m0s
    # instance_addr: "127.0.0.1"
    # instance_id: "miyamoto.en0"
    # instance_interface_names: ["en0","lo0"]
    instance_port: 9500
    num_tokens: 100
ingester_client:
  pool_config:
    health_check_ingesters: false
    client_cleanup_period: 10s
    remote_timeout: 3s
  remote_timeout: 5s
ingester:
  autoforget_unhealthy: true
  chunk_encoding: gzip
  chunk_target_size: 1572864
  max_transfer_retries: 0
  sync_min_utilization: 3.5
  sync_period: 20s
  flush_check_period: 30s
  flush_op_timeout: 10m0s
  chunk_retain_period: 1m30s
  chunk_block_size: 262144
  chunk_idle_period: 1h0s
  max_returned_stream_errors: 20
  concurrent_flushes: 3
  index_shards: 32
  max_chunk_age: 2h0m0s
  query_store_max_look_back_period: 3h30m30s
  wal:
    enabled: true
    dir: /data/loki/wal
    flush_on_shutdown: true
    checkpoint_duration: 15m
    replay_memory_ceiling: 2GB
  lifecycler:
    ring:
      kvstore:
        store: memberlist
        prefix: "collectors/"
      heartbeat_timeout: 30s
      replication_factor: 1
    num_tokens: 128
    heartbeat_period: 5s
    join_after: 5s
    observe_period: 1m0s
    # interface_names: ["en0","lo0"]
    final_sleep: 10s
    min_ready_duration: 15s
storage_config:
  boltdb:
    directory: /data/loki/boltdb
  boltdb_shipper:
    active_index_directory: /data/loki/active_index
    build_per_tenant_index: true
    cache_location: /data/loki/cache
    cache_ttl: 48h
    resync_interval: 5m
    query_ready_num_days: 5
    index_gateway_client:
      grpc_client_config:
  filesystem:
    directory: /data/loki/chunks
chunk_store_config:
  chunk_cache_config:
    enable_fifocache: true
    default_validity: 30s
    background:
      writeback_buffer: 10000
    redis:
      endpoint: 192.168.3.56:6379
      timeout: 1s
      expiration: 0s
      db: 8
      pool_size: 128
      password: 1521Qyx6^
      tls_enabled: false
      tls_insecure_skip_verify: true
      idle_timeout: 10s
      max_connection_age: 8h
    fifocache:
      ttl: 1h
      validity: 30m0s
      max_size_items: 2000
      max_size_bytes: 500MB
  write_dedupe_cache_config:
    enable_fifocache: true
    default_validity: 30s
    background:
      writeback_buffer: 10000
    redis:
      endpoint: 127.0.0.1:6379
      timeout: 1s
      expiration: 0s
      db: 7
      pool_size: 128
      password: 1521Qyx6^
      tls_enabled: false
      tls_insecure_skip_verify: true
      idle_timeout: 10s
      max_connection_age: 8h
    fifocache:
      ttl: 1h
      validity: 30m0s
      max_size_items: 2000
      max_size_bytes: 500MB
  cache_lookups_older_than: 10s
# compacts index fragments
compactor:
  shared_store: filesystem
  shared_store_key_prefix: index/
  working_directory: /data/loki/compactor
  compaction_interval: 10m0s
  retention_enabled: true
  retention_delete_delay: 2h0m0s
  retention_delete_worker_count: 150
  delete_request_cancel_period: 24h0m0s
  max_compaction_parallelism: 2
  # compactor_ring:
frontend_worker:
  match_max_concurrent: true
  parallelism: 10
  dns_lookup_duration: 5s
# nothing is configured under runtime_config here
# runtime_config:
common:
  storage:
    filesystem:
      chunks_directory: /data/loki/chunks
      rules_directory: /data/loki/rulers
  replication_factor: 3
  persist_tokens: false
  # instance_interface_names: ["en0","eth0","ens33"]
analytics:
  reporting_enabled: false
limits_config:
  ingestion_rate_strategy: global
  ingestion_rate_mb: 100
  ingestion_burst_size_mb: 18
  max_label_name_length: 2096
  max_label_value_length: 2048
  max_label_names_per_series: 60
  enforce_metric_name: true
  max_entries_limit_per_query: 5000
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  creation_grace_period: 20m0s
  max_global_streams_per_user: 5000
  unordered_writes: true
  max_chunks_per_query: 200000
  max_query_length: 721h
  max_query_parallelism: 64
  max_query_series: 700
  cardinality_limit: 100000
  max_streams_matchers_per_query: 1000
  max_concurrent_tail_requests: 10
  ruler_evaluation_delay_duration: 3s
  ruler_max_rules_per_rule_group: 0
  ruler_max_rule_groups_per_tenant: 0
  retention_period: 700h
  per_tenant_override_period: 20s
  max_cache_freshness_per_query: 2m0s
  max_queriers_per_tenant: 0
  per_stream_rate_limit: 6MB
  per_stream_rate_limit_burst: 50MB
  max_query_lookback: 0
  ruler_remote_write_disabled: false
  min_sharding_lookback: 0s
  split_queries_by_interval: 10m0s
  max_line_size: 30mb
  max_line_size_truncate: false
  max_streams_per_user: 0

# The memberlist block configures the gossip protocol used to discover and
# connect the distributors, ingesters, and queriers. The configuration is
# shared by all three components to ensure a single shared ring.
# Once at least one join_members entry is defined, a memberlist-type kvstore
# is configured automatically for the distributor, ingester, and ruler rings.
memberlist:
  randomize_node_name: true
  stream_timeout: 5s
  retransmit_factor: 4
  join_members:
    - 'loki-memberlist'
  abort_if_cluster_join_fails: true
  advertise_addr: 0.0.0.0
  advertise_port: 7946
  bind_addr: ["0.0.0.0"]
  bind_port: 7946
  compression_enabled: true
  dead_node_reclaim_time: 30s
  gossip_interval: 100ms
  gossip_nodes: 3
  gossip_to_dead_nodes_time: 3
  # join:
  leave_timeout: 15s
  left_ingesters_timeout: 3m0s
  max_join_backoff: 1m0s
  max_join_retries: 5
  message_history_buffer_bytes: 4096
  min_join_backoff: 2s
  # node_name: miyamoto
  packet_dial_timeout: 5s
  packet_write_timeout: 5s
  pull_push_interval: 100ms
  rejoin_interval: 10s
  tls_enabled: false
  tls_insecure_skip_verify: true
schema_config:
  configs:
    - from: "2020-10-24"
      index:
        period: 24h
        prefix: index_
      object_store: filesystem
      schema: v11
      store: boltdb-shipper
      chunks:
        period: 168h
      row_shards: 32
table_manager:
  retention_deletes_enabled: false
  retention_period: 0s
  throughput_updates_disabled: false
  poll_interval: 3m0s
  creation_grace_period: 20m
  index_tables_provisioning:
    provisioned_write_throughput: 1000
    provisioned_read_throughput: 500
    inactive_write_throughput: 4
    inactive_read_throughput: 300
    inactive_write_scale_lastn: 50
    enable_inactive_throughput_on_demand_mode: true
    enable_ondemand_throughput_mode: true
    inactive_read_scale_lastn: 10
    write_scale:
      enabled: true
      target: 80
      # role_arn:
      out_cooldown: 1800
      min_capacity: 3000
      max_capacity: 6000
      in_cooldown: 1800
    inactive_write_scale:
      enabled: true
      target: 80
      out_cooldown: 1800
      min_capacity: 3000
      max_capacity: 6000
      in_cooldown: 1800
    read_scale:
      enabled: true
      target: 80
      out_cooldown: 1800
      min_capacity: 3000
      max_capacity: 6000
      in_cooldown: 1800
    inactive_read_scale:
      enabled: true
      target: 80
      out_cooldown: 1800
      min_capacity: 3000
      max_capacity: 6000
      in_cooldown: 1800
  chunk_tables_provisioning:
    enable_inactive_throughput_on_demand_mode: true
    enable_ondemand_throughput_mode: true
    provisioned_write_throughput: 1000
    provisioned_read_throughput: 300
    inactive_write_throughput: 1
    inactive_write_scale_lastn: 50
    inactive_read_throughput: 300
    inactive_read_scale_lastn: 10
    write_scale:
      enabled: true
      target: 80
      out_cooldown: 1800
      min_capacity: 3000
      max_capacity: 6000
      in_cooldown: 1800
    inactive_write_scale:
      enabled: true
      target: 80
      out_cooldown: 1800
      min_capacity: 3000
      max_capacity: 6000
      in_cooldown: 1800
    read_scale:
      enabled: true
      target: 80
      out_cooldown: 1800
      min_capacity: 3000
      max_capacity: 6000
      in_cooldown: 1800
    inactive_read_scale:
      enabled: true
      target: 80
      out_cooldown: 1800
      min_capacity: 3000
      max_capacity: 6000
      in_cooldown: 1800
tracing:
  enabled: true
```

Notes

• ingester.lifecycler.ring.replication_factor is 1 in a single-instance deployment

• ingester.lifecycler.min_ready_duration is 15s; after startup the instance waits 15 seconds by default before its state changes to ready

• memberlist.node_name can be left unset; it defaults to the current hostname

• memberlist.join_members is a list; with multiple instances, add each node's hostname/IP address. In k8s it can be a Service bound to the StatefulSet

• query_range.results_cache.cache.enable_fifocache is recommended to be false, which is how it is set above (Redis serves as the results cache instead)

• instance_interface_names is a list defaulting to ["en0","eth0"]; set the interface names as needed, though usually no special setting is required

1.3.1.2 Creating the ConfigMap

Write the content above to a file, loki-all.yaml, and load it into the k8s cluster as a ConfigMap. Note that the ConfigMap name must match the one referenced by the StatefulSet below (loki). It can be created with:

```bash
kubectl create configmap loki --from-file=./loki-all.yaml
```

You can confirm the ConfigMap was created with kubectl get configmap loki.

1.3.1.3 Creating Persistent Storage

Data in k8s needs to be persisted. The logs Loki collects are critical to the business, so they must survive container restarts. That requires a PV and a PVC; the backing store can be nfs, glusterfs, hostPath, azureDisk, cephfs, or any of the other roughly 20 supported types. Since no shared storage is available in this environment, hostPath is used.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: loki
spec:
  hostPath:
    path: /glusterfs/loki
    type: DirectoryOrCreate
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: loki
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  volumeName: loki
```
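Apply the manifest and check that the claim binds (loki-pv.yaml is an assumed file name for the manifest above):

```bash
kubectl apply -f loki-pv.yaml
kubectl get pv,pvc
```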

1.3.1.4 Creating the Application

With the k8s StatefulSet manifest prepared, you can create the application directly in the cluster.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: loki
  name: loki
  namespace: default
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  selector:
    matchLabels:
      app: loki
  template:
    metadata:
      annotations:
        prometheus.io/port: http-metrics
        prometheus.io/scrape: "true"
      labels:
        app: loki
    spec:
      containers:
        - args:
            - -config.file=/etc/loki/loki-all.yaml
          image: grafana/loki:2.5.0
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /ready
              port: http-metrics
              scheme: HTTP
            initialDelaySeconds: 45
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          name: loki
          ports:
            - containerPort: 3100
              name: http-metrics
              protocol: TCP
            - containerPort: 9095
              name: grpc
              protocol: TCP
            - containerPort: 7946
              name: memberlist-port
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /ready
              port: http-metrics
              scheme: HTTP
            initialDelaySeconds: 45
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 500m
              memory: 500Mi
            limits:
              cpu: 500m
              memory: 500Mi
          securityContext:
            readOnlyRootFilesystem: true
          volumeMounts:
            - mountPath: /etc/loki
              name: config
            - mountPath: /data
              name: storage
      restartPolicy: Always
      securityContext:
        fsGroup: 10001
        runAsGroup: 10001
        runAsNonRoot: true
        runAsUser: 10001
      serviceAccount: loki
      serviceAccountName: loki
      volumes:
        - emptyDir: {}
          name: tmp
        - name: config
          configMap:
            name: loki
        - persistentVolumeClaim:
            claimName: loki
          name: storage
---
kind: Service
apiVersion: v1
metadata:
  name: loki-memberlist
  namespace: default
spec:
  ports:
    - name: loki-memberlist
      protocol: TCP
      port: 7946
      targetPort: 7946
  selector:
    app: loki
---
kind: Service
apiVersion: v1
metadata:
  name: loki
  namespace: default
spec:
  ports:
    - name: loki
      protocol: TCP
      port: 3100
      targetPort: 3100
  selector:
    app: loki
```

In the manifests above I added some pod-level security settings; there are also cluster-level PodSecurityPolicy controls, to keep a single vulnerability from taking down the entire cluster.

1.3.1.5 Verifying the Deployment

Once the pod reaches the Running state, you can check through the API whether the distributor is working; only when it reports Active will log streams be distributed to the ingester normally.
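A quick way to check (a sketch; it assumes the loki Service and port defined above and a kubectl context pointing at the cluster):

```bash
# forward the loki service to a local port
kubectl port-forward svc/loki 3100:3100 &

# readiness: returns "ready" once the instance has joined the ring
curl http://127.0.0.1:3100/ready

# ring status page: the ingester should be listed as ACTIVE
curl http://127.0.0.1:3100/ring
```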

1.3.2 Bare-Metal Deployment

Place the loki binary under the system's /bin/ directory, prepare a grafana-loki.service unit file, and reload the systemd unit list.

```ini
[Unit]
Description=Grafana Loki Log Ingester
Documentation=https://grafana.com/logs/
After=network-online.target

[Service]
ExecStart=/bin/loki --config.file /etc/loki/loki-all.yaml
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID

[Install]
WantedBy=multi-user.target
```

Reload the unit list, after which systemd manages the service directly:

```bash
systemctl daemon-reload
# start the service
systemctl start grafana-loki
# stop the service
systemctl stop grafana-loki
# reload the service
systemctl reload grafana-loki
```

1.4 Promtail Deployment

Deploying the client to collect logs also requires a configuration file, created following the same steps as for the server side. The difference is that the client pushes the log content to the server.

1.4.1 k8s Deployment

1.4.1.1 Creating the Configuration File

```yaml
server:
  log_level: info
  http_listen_port: 3101
clients:
  - url: http://loki:3100/loki/api/v1/push
positions:
  filename: /run/promtail/positions.yaml
scrape_configs:
  - job_name: kubernetes-pods
    pipeline_stages:
      - cri: {}
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels:
          - __meta_kubernetes_pod_controller_name
        regex: ([0-9a-z-.]+?)(-[0-9a-f]{8,10})?
        action: replace
        target_label: __tmp_controller_name
      - source_labels:
          - __meta_kubernetes_pod_label_app_kubernetes_io_name
          - __meta_kubernetes_pod_label_app
          - __tmp_controller_name
          - __meta_kubernetes_pod_name
        regex: ^;*([^;]+)(;.*)?$
        action: replace
        target_label: app
      - source_labels:
          - __meta_kubernetes_pod_label_app_kubernetes_io_instance
          - __meta_kubernetes_pod_label_release
        regex: ^;*([^;]+)(;.*)?$
        action: replace
        target_label: instance
      - source_labels:
          - __meta_kubernetes_pod_label_app_kubernetes_io_component
          - __meta_kubernetes_pod_label_component
        regex: ^;*([^;]+)(;.*)?$
        action: replace
        target_label: component
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_node_name
        target_label: node_name
      - action: replace
        source_labels:
          - __meta_kubernetes_namespace
        target_label: namespace
      - action: replace
        replacement: $1
        separator: /
        source_labels:
          - namespace
          - app
        target_label: job
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_name
        target_label: pod
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_container_name
        target_label: container
      - action: replace
        replacement: /var/log/pods/*$1/*.log
        separator: /
        source_labels:
          - __meta_kubernetes_pod_uid
          - __meta_kubernetes_pod_container_name
        target_label: __path__
      - action: replace
        regex: true/(.*)
        replacement: /var/log/pods/*$1/*.log
        separator: /
        source_labels:
          - __meta_kubernetes_pod_annotationpresent_kubernetes_io_config_hash
          - __meta_kubernetes_pod_annotation_kubernetes_io_config_hash
          - __meta_kubernetes_pod_container_name
        target_label: __path__
```

Create a ConfigMap from the content above, using the same method as before.
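For example (assuming the configuration above was saved as promtail.yaml; the DaemonSet below mounts a ConfigMap named promtail and reads /etc/promtail/promtail.yaml, which matches the key created from that file name):

```bash
kubectl create configmap promtail --from-file=./promtail.yaml
```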

1.4.1.2 Creating the DaemonSet File

Promtail is a stateless application and needs no persistent storage; it only has to be deployed into the cluster. As before, prepare the DaemonSet manifest.

```yaml
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: promtail
  namespace: default
  labels:
    app.kubernetes.io/instance: promtail
    app.kubernetes.io/name: promtail
    app.kubernetes.io/version: 2.5.0
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: promtail
      app.kubernetes.io/name: promtail
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: promtail
        app.kubernetes.io/name: promtail
    spec:
      volumes:
        - name: config
          configMap:
            name: promtail
        - name: run
          hostPath:
            path: /run/promtail
        - name: containers
          hostPath:
            path: /var/lib/docker/containers
        - name: pods
          hostPath:
            path: /var/log/pods
      containers:
        - name: promtail
          image: docker.io/grafana/promtail:2.5.0
          args:
            - '-config.file=/etc/promtail/promtail.yaml'
          ports:
            - name: http-metrics
              containerPort: 3101
              protocol: TCP
          env:
            - name: HOSTNAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
          volumeMounts:
            - name: config
              mountPath: /etc/promtail
            - name: run
              mountPath: /run/promtail
            - name: containers
              readOnly: true
              mountPath: /var/lib/docker/containers
            - name: pods
              readOnly: true
              mountPath: /var/log/pods
          readinessProbe:
            httpGet:
              path: /ready
              port: http-metrics
              scheme: HTTP
            initialDelaySeconds: 10
            timeoutSeconds: 1
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 5
          imagePullPolicy: IfNotPresent
          securityContext:
            capabilities:
              drop:
                - ALL
            readOnlyRootFilesystem: false
            allowPrivilegeEscalation: false
      restartPolicy: Always
      serviceAccountName: promtail
      serviceAccount: promtail
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
```
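Note that the manifest references a promtail ServiceAccount that is not created anywhere in this article; promtail's kubernetes_sd_configs needs permission to list and watch pods. A minimal RBAC sketch (an assumption on my part, modeled on the permissions the official promtail chart grants) could look like:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: promtail
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: promtail
rules:
  # read access for pod discovery across the cluster
  - apiGroups: [""]
    resources: ["nodes", "services", "endpoints", "pods"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: promtail
subjects:
  - kind: ServiceAccount
    name: promtail
    namespace: default
roleRef:
  kind: ClusterRole
  name: promtail
  apiGroup: rbac.authorization.k8s.io
```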

1.4.1.3 Creating the promtail Application

```bash
kubectl apply -f promtail.yaml
```

After running the command above, the service is created. Next, add a DataSource in Grafana and look at the data.

1.4.2 Bare-Metal Deployment

For a bare-metal deployment, the configuration file above needs only a slight change: update the clients address and store the file under /etc/loki/. For example:

```yaml
clients:
  - url: http://ipaddress:port/loki/api/v1/push
```

Add a boot-time service configuration; place the unit file at /usr/lib/systemd/system/loki-promtail.service with the following content:

```ini
[Unit]
Description=Grafana Promtail Log Collector
Documentation=https://grafana.com/logs/
After=network-online.target

[Service]
ExecStart=/bin/promtail --config.file /etc/loki/loki-promtail.yaml
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID

[Install]
WantedBy=multi-user.target
```

Start it the same way as the server-side service above.

1.5 Data Source

Add the data source in Grafana: Grafana -> Settings -> Data Sources -> Add data source -> Loki.

Pay attention to the HTTP URL: whichever namespace the application or service is deployed in, you must use its FQDN, in the format ServiceName.namespace. If it is in default and the port is 3100, enter http://loki:3100. The service name is used here instead of an IP address because the DNS server inside the k8s cluster resolves the name automatically.

Querying logs:
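Once the data source is connected you can query through Grafana Explore, or hit the Loki HTTP API directly. A quick sketch (it assumes the loki Service in the default namespace as configured above):

```bash
# list the known values for the "app" label
curl -G -s "http://loki:3100/loki/api/v1/label/app/values"

# fetch recent log lines for a stream selector
curl -G -s "http://loki:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={app="loki"}' \
  --data-urlencode 'limit=10'
```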

1.6 Other Client Configurations

1.6.1 Logstash as a Log Collection Client

After starting Logstash, install the Loki output plugin with the command below; once it is installed, the settings can be added to the logstash output block.

```bash
bin/logstash-plugin install logstash-output-loki
```

Add the configuration and test it. For complete logstash configuration information, see the official LogstashConfigFile documentation.

```
output {
  loki {
    [url => "" | default = none | required=true]
    [tenant_id => string | default = nil | required=false]
    [message_field => string | default = "message" | required=false]
    [include_fields => array | default = [] | required=false]
    [batch_wait => number | default = 1(s) | required=false]
    [batch_size => number | default = 102400(bytes) | required=false]
    [min_delay => number | default = 1(s) | required=false]
    [max_delay => number | default = 300(s) | required=false]
    [retries => number | default = 10 | required=false]
    [username => string | default = nil | required=false]
    [password => secret | default = nil | required=false]
    [cert => path | default = nil | required=false]
    [key => path | default = nil | required=false]
    [ca_cert => path | default = nil | required=false]
    [insecure_skip_verify => boolean | default = false | required=false]
  }
}
```
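As a concrete example (a sketch; the URL assumes the in-cluster loki Service used earlier, and the non-default values are only illustrative):

```
output {
  loki {
    # push endpoint of the Loki server
    url => "http://loki:3100/loki/api/v1/push"
    # batch up to 100 KiB or 1 s, whichever comes first
    batch_size => 102400
    batch_wait => 1
    retries => 5
  }
}
```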

Alternatively, use logstash's http output module, configured as follows:

```
output {
  http {
    format => "json"
    http_method => "post"
    content_type => "application/json"
    connect_timeout => 10
    url => "http://loki:3100/loki/api/v1/push"
    message => '{"message":"%{message}"}'
  }
}
```

1.7 Helm Installation

For a simpler installation, use helm. Helm wraps all the installation steps, simplifying deployment.

For someone who wants to understand k8s in detail, helm is less suitable: it runs everything automatically once wrapped up, so the k8s administrator never sees how the components depend on each other, which can lead to blind spots.

Without further ado, the helm installation:

• Add the repo source

```bash
helm repo add grafana https://grafana.github.io/helm-charts
```

• Update the repo

```bash
helm repo update
```

• Deploy

```bash
# default configuration
helm upgrade --install loki grafana/loki-simple-scalable
# custom namespace
helm upgrade --install loki --namespace=loki grafana/loki-simple-scalable
# custom configuration values
helm upgrade --install loki grafana/loki-simple-scalable --set "key1=val1,key2=val2,..."
```
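When more than a couple of values need changing, a values file is usually easier than --set (a sketch using standard helm commands):

```bash
# dump the chart's default values, edit the file, then install with it
helm show values grafana/loki-simple-scalable > values.yaml
helm upgrade --install loki grafana/loki-simple-scalable -f values.yaml
```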

1.8 Troubleshooting

1.8.1 502 BadGateWay

The Loki address is filled in incorrectly. In k8s, a wrong address causes the 502. Check that the Loki address takes one of these forms:

```
http://LokiServiceName
http://LokiServiceName.namespace
http://LokiServiceName.namespace:ServicePort
```

If grafana and loki run on different nodes, check inter-node network connectivity and firewall policies.

1.8.2 Ingester not ready: instance xx:9095 in state JOINING

Wait patiently for a moment; in allInOne mode the program needs some time to start up.

1.8.3 too many unhealthy instances in the ring

Change ingester.lifecycler.ring.replication_factor to 1; this error is caused by that setting being wrong. Multiple replicas were configured at startup, but only one instance is actually deployed, so this message appears when viewing labels.

1.8.4 Data source connected

Data source connected, but no labels received. Verify that Loki and Promtail is configured properly

• promtail cannot send the collected logs to loki; check whether promtail's output is normal

• promtail sent logs before loki was ready, so loki never received them. To re-ingest the logs, delete the positions.yaml file; use find to locate it (see the command after this list)

• promtail is ignoring the target log files, or a configuration file error prevents it from starting properly

• promtail cannot find log files at the specified locations
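For example, a generic search for the positions file (its exact path depends on your promtail configuration):

```bash
find / -name positions.yaml 2>/dev/null
```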

Link: https://www.cnblogs.com/jingzh/p/17998082

(Copyright belongs to the original author; it will be removed upon request.)
