Prometheus, the Monitoring Powerhouse: Kubernetes (Part 8)

icegoblin
Published 2022-7-4 17:02

 

Manually deploying a Prometheus federation in Kubernetes.

[Figure: monitor-prom]

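Once we run multiple Kubernetes clusters, we need to aggregate their metrics in one place. As in the figure above, we assume a Prometheus federate instance is deployed externally and scrapes the kube-system and default namespaces of the current k8s cluster.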
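
The namespace restriction described above can be expressed through the match[] parameter of the /federate endpoint. The fragment below is only a sketch: it assumes the upstream series carry a namespace label (which depends on the relabeling rules of the upstream Prometheus), and the job name and target are illustrative placeholders.

```yaml
# Sketch of a federate scrape job limited to two namespaces.
# Assumption: upstream series have a `namespace` label.
scrape_configs:
  - job_name: 'federate'
    honor_labels: true          # keep the labels set by the upstream Prometheus
    metrics_path: '/federate'
    params:
      'match[]':
        # Each entry is an instant vector selector; the results are unioned.
        - '{job=~"kubernetes.*",namespace=~"kube-system|default"}'
    static_configs:
      - targets:
        - 'prometheus-0.prometheus:9090'   # upstream Prometheus (placeholder)
```

Selecting on a label rather than listing metric names keeps the federate job short, at the cost of pulling every matching series.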

 

Environment
My local environment was deployed in one step with sealos, mainly to make testing easier.

Deploying the Prometheus federation cluster
Create the prometheus-federate data directory

# Run on m1
# -p avoids an error if /data or the directory itself already exists
mkdir -p /data/prometheus-federate/
chown -R 65534:65534 /data/prometheus-federate/

Create the StorageClass configuration file for the Prometheus federation

cd /data/manual-deploy/prometheus/
cat prometheus-federate-storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: prometheus-federate-lpv
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

Create the PV configuration file for the Prometheus federation

apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-federate-lpv-0
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: prometheus-federate-lpv
  local:
    path: /data/prometheus-federate
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - sealos-k8s-m1
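
A local PV like this one is only bound once a claim requests the same StorageClass, and with WaitForFirstConsumer the binding is delayed until a Pod using the claim is scheduled. As a minimal sketch, a matching claim could look like the following; the claim name is hypothetical, and in practice the claim may instead come from a StatefulSet's volumeClaimTemplates.

```yaml
# Hypothetical PVC that would bind to prometheus-federate-lpv-0.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-federate-data   # assumed name, not from the original article
  namespace: kube-system
spec:
  accessModes:
  - ReadWriteOnce                  # must match the PV's access mode
  storageClassName: prometheus-federate-lpv
  resources:
    requests:
      storage: 10Gi                # must not exceed the PV capacity
```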

Create the ConfigMap configuration file for the Prometheus federation

cat prometheus-federate-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-federate-config
  namespace: kube-system
data:
  alertmanager_rules.yaml: |
    groups:
    - name: example
      rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
      - alert: NodeMemoryUsage
        expr: (node_memory_MemTotal_bytes -(node_memory_MemFree_bytes+node_memory_Buffers_bytes+node_memory_Cached_bytes )) / node_memory_MemTotal_bytes * 100 > 80
        for: 1m
        labels:
          team: ops
        annotations:
          summary: "cluster:{{ $labels.cluster }} {{ $labels.instance }}: High Memory usage detected"
          description: "{{ $labels.instance }}: Memory usage is above 80% (current value is: {{ $value }})"
  prometheus.yml: |
    global:
      scrape_interval:     30s
      evaluation_interval: 30s
    alerting:
      alertmanagers:
      - static_configs:
        - targets:
            - alertmanager-0.alertmanager-operated:9093
            - alertmanager-1.alertmanager-operated:9093      
    rule_files:
      - "/etc/prometheus/alertmanager_rules.yaml"
    scrape_configs:
      - job_name: 'federate'
        scrape_interval: 30s
        honor_labels: true
        metrics_path: '/federate'
        params:
          'match[]':
            - '{job=~"kubernetes.*"}'
            - '{job="prometheus"}'
        static_configs:
          - targets:
            - 'prometheus-0.prometheus:9090'
            - 'prometheus-1.prometheus:9090'
            - 'prometheus-2.prometheus:9090'
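
The NodeMemoryUsage alert above prints {{ $labels.cluster }}, so the federated series need a cluster label. One common way to provide it is external_labels on each upstream Prometheus: Prometheus attaches these labels when serving /federate, and honor_labels: true in the federate job preserves them. A sketch, where the label value is a made-up example:

```yaml
# Fragment for the prometheus.yml of each upstream (per-cluster) Prometheus,
# not for the federate instance configured above.
global:
  external_labels:
    cluster: k8s-cluster-1   # hypothetical value; use a distinct one per cluster
```

Without such a label, the cluster placeholder in the alert summary would render as an empty string.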

 

Feel free to follow my WeChat official account, ID: k8stech


Reposted from the WeChat official account: Kubernetes技术栈

Last modified 2022-7-4 17:02:35