Monitoring a Docker Swarm Cluster with Prometheus (Part 3)

icegoblin
Published on 2022-7-4 17:07

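Monitoring the Swarm cluster
With the Swarm cluster initialized, there are two monitoring options. The first is based on cAdvisor + InfluxDB + Grafana, and its compose file looks like this: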

cat docker-compose-monitor.yml
version: '3'
 
services:
  influx:
    image: influxdb
    volumes:
      - influx:/var/lib/influxdb
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == manager
 
  grafana:
    image: grafana/grafana
    ports:
      - 0.0.0.0:80:3000
    volumes:
      - grafana:/var/lib/grafana
    depends_on:
      - influx
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == manager
 
  cadvisor:
    image: google/cadvisor
    hostname: '{{.Node.Hostname}}'
    command: -logtostderr -docker_only -storage_driver=influxdb -storage_driver_db=cadvisor -storage_driver_host=influx:8086
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    depends_on:
      - influx
    deploy:
      mode: global
 
volumes:
  influx:
    driver: local
  grafana:
    driver: local

This article covers only the second option, which is based on cAdvisor + Prometheus + Grafana.

git clone https://github.com/cyancow/swarmprom.git
cd swarmprom
ADMIN_USER=admin \
ADMIN_PASSWORD=admin \
SLACK_URL=https://hooks.slack.com/services/9935226 \
SLACK_CHANNEL=devops-alerts \
SLACK_USER=alertmanager \
docker stack deploy -c docker-compose.yml mon
# output 
Creating network mon_net
Creating config mon_caddy_config
Creating config mon_dockerd_config
Creating config mon_node_rules
Creating config mon_task_rules
Creating service mon_prometheus
Creating service mon_caddy
Creating service mon_dockerd-exporter
Creating service mon_cadvisor
Creating service mon_grafana
Creating service mon_alertmanager
Creating service mon_unsee
Creating service mon_node-exporter

# List the deployed stacks
docker stack ls
NAME                SERVICES            ORCHESTRATOR
mon                 8                   Swarm

# List the deployed services
docker service ls
ID                  NAME                   MODE                REPLICAS            IMAGE                                          PORTS
xnkq61woc3ag        mon_alertmanager       replicated          1/1                 stefanprodan/swarmprom-alertmanager:v0.14.0
tzxe317tffgl        mon_caddy              replicated          1/1                 stefanprodan/caddy:latest                      *:3000->3000/tcp, *:9090->9090/tcp, *:9093-9094->9093-9094/tcp
06rv2rj9oxbo        mon_cadvisor           global              3/3                 google/cadvisor:latest
ropkluyyxora        mon_dockerd-exporter   global              3/3                 stefanprodan/caddy:latest
29ygw9r4a92c        mon_grafana            replicated          1/1                 stefanprodan/swarmprom-grafana:5.3.4
whqtwwmfvdjl        mon_node-exporter      global              3/3                 stefanprodan/swarmprom-node-exporter:v0.16.0
xv19nuesymol        mon_prometheus         replicated          1/1                 stefanprodan/swarmprom-prometheus:v2.5.0
ia2g1ayhzjf6        mon_unsee              replicated          1/1                 cloudflare/unsee:v0.8.0
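
Scanning the REPLICAS column by eye gets tedious as the stack grows. The following is a minimal sketch of an automated check, not part of the original deployment: the helper name `unconverged_services` is hypothetical, and it assumes each input line has the form `<name> <running>/<desired>`, which is what `docker service ls --format '{{.Name}} {{.Replicas}}'` prints.

```shell
# Hypothetical helper: print the name of every service whose running task
# count differs from its desired count.
# Input (stdin): one line per service, "<name> <running>/<desired>".
unconverged_services() {
  awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Example with sample data: only the second service is reported.
printf 'mon_grafana 1/1\nmon_cadvisor 2/3\n' | unconverged_services

# On a manager node, feed it the real service list instead:
# docker service ls --format '{{.Name}} {{.Replicas}}' | unconverged_services
```

Empty output means every service has reached its desired replica count.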

To also deploy Portainer in the Swarm, add the following service definitions to docker-compose.yml:

...
services:
  agent:
    image: portainer/agent
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /var/lib/docker/volumes:/var/lib/docker/volumes
    ports:
      - target: 9001
        published: 9001
        protocol: tcp
        mode: host
    networks:
      - net
    deploy:
      mode: global
      placement:
        constraints: [node.platform.os == linux]

  portainer:
    image: portainer/portainer
    command: -H tcp://tasks.agent:9001 --tlsskipverify
    ports:
      - "9000:9000"  # Portainer web UI
      - "8000:8000"  # Edge agent tunnel server
    volumes:
      - portainer_data:/data
    networks:
      - net
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [node.role == manager]
...
# Update the stack with the following command
docker stack deploy -c docker-compose.yml mon

Next, deploy a service and let Prometheus discover and monitor it automatically:

cat test-compose.yml
version: "3.3"

networks:
  net:
    driver: overlay
    attachable: true
  mon_net:
    external: true

services:

  mongo:
    image: healthcheck/mongo:latest
    networks:
      - net
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role != manager

  mongo-exporter:
    image: forekshub/percona-mongodb-exporter:latest
    networks:
      - net
      - mon_net
    ports:
      - "9216:9216"
    environment:
      - MONGODB_URL=mongodb://mongo:27017
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager

# Deploy
docker stack deploy -c test-compose.yml mongo

# List the stacks
docker stack ls
NAME                SERVICES            ORCHESTRATOR
mon                 10                  Swarm
mongo               2                   Swarm

# List the services
docker service ls|grep mongo
o20avg5k0lqb        mongo_mongo            replicated          1/1                 healthcheck/mongo:latest
6atp7sl2byeu        mongo_mongo-exporter   replicated          1/1                 forekshub/percona-mongodb-exporter:latest      *:9216->9216/tcp

# On one of the nodes, check whether mongo was deployed successfully
docker ps -a|grep mongo
102b337589aa        healthcheck/mongo:latest                       "docker-entrypoint.s…"   18 minutes ago      Up 18 minutes (healthy)   27017/tcp                mongo_mongo.1.whn157ky895refdogo4s3imrw
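
The exporter is attached to mon_net so that Prometheus can reach it, but Prometheus still needs a scrape job that points at it. The sketch below prints one possible job using Prometheus's DNS service discovery (`dns_sd_configs`). The job name, the wrapper function `mongo_scrape_job`, and the DNS name `tasks.mongo-exporter` are assumptions based on the service name and port in test-compose.yml, so check that the name resolves on mon_net before relying on it.

```shell
# Hypothetical helper: print a scrape job to paste under "scrape_configs:"
# in the Prometheus configuration used by the mon stack.
# Assumptions: the exporter's tasks resolve as tasks.mongo-exporter on
# mon_net, and it listens on port 9216 as declared in test-compose.yml.
mongo_scrape_job() {
  cat <<'EOF'
  - job_name: 'mongo-exporter'
    dns_sd_configs:
      - names:
          - 'tasks.mongo-exporter'
        type: 'A'
        port: 9216
EOF
}

mongo_scrape_job
```

With an `A` record lookup, Prometheus gets one target per running task, so scaling the exporter up or down needs no further configuration change.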

 

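Summary
That completes the walkthrough of monitoring a Swarm cluster. The stack deployed above already ships with a few simple rules; for details on configuring Alertmanager and alerting rules, see the official documentation.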

Reposted from the WeChat official account Kubernetes技术栈

Last modified on 2022-7-4 17:07:18