Monitoring a Docker Swarm Cluster with Prometheus (Part 3)
icegoblin
Published on 2022-7-4 17:07
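Monitoring the Swarm Cluster
OK, the Swarm cluster has been initialized. The first option is a stack based on cAdvisor + InfluxDB + Grafana, defined by the following YAML file: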
cat docker-compose-monitor.yml
version: '3'
services:
  influx:
    image: influxdb
    volumes:
      - influx:/var/lib/influxdb
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == manager
  grafana:
    image: grafana/grafana
    ports:
      - 0.0.0.0:80:3000
    volumes:
      - grafana:/var/lib/grafana
    depends_on:
      - influx
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == manager
  cadvisor:
    image: google/cadvisor
    hostname: '{{.Node.Hostname}}'
    command: -logtostderr -docker_only -storage_driver=influxdb -storage_driver_db=cadvisor -storage_driver_host=influx:8086
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    depends_on:
      - influx
    deploy:
      mode: global
volumes:
  influx:
    driver: local
  grafana:
    driver: local
Here we cover only the second option, which is based on cAdvisor + Prometheus + Grafana.
git clone https://github.com/cyancow/swarmprom.git
cd swarmprom
ADMIN_USER=admin \
ADMIN_PASSWORD=admin \
SLACK_URL=https://hooks.slack.com/services/9935226 \
SLACK_CHANNEL=devops-alerts \
SLACK_USER=alertmanager \
docker stack deploy -c docker-compose.yml mon
# output
Creating network mon_net
Creating config mon_caddy_config
Creating config mon_dockerd_config
Creating config mon_node_rules
Creating config mon_task_rules
Creating service mon_prometheus
Creating service mon_caddy
Creating service mon_dockerd-exporter
Creating service mon_cadvisor
Creating service mon_grafana
Creating service mon_alertmanager
Creating service mon_unsee
Creating service mon_node-exporter
# List the deployed stacks
docker stack ls
NAME   SERVICES   ORCHESTRATOR
mon    8          Swarm
# List the deployed services
docker service ls
ID             NAME                   MODE         REPLICAS   IMAGE                                          PORTS
xnkq61woc3ag   mon_alertmanager       replicated   1/1        stefanprodan/swarmprom-alertmanager:v0.14.0
tzxe317tffgl   mon_caddy              replicated   1/1        stefanprodan/caddy:latest                      *:3000->3000/tcp, *:9090->9090/tcp, *:9093-9094->9093-9094/tcp
06rv2rj9oxbo   mon_cadvisor           global       3/3        google/cadvisor:latest
ropkluyyxora   mon_dockerd-exporter   global       3/3        stefanprodan/caddy:latest
29ygw9r4a92c   mon_grafana            replicated   1/1        stefanprodan/swarmprom-grafana:5.3.4
whqtwwmfvdjl   mon_node-exporter      global       3/3        stefanprodan/swarmprom-node-exporter:v0.16.0
xv19nuesymol   mon_prometheus         replicated   1/1        stefanprodan/swarmprom-prometheus:v2.5.0
ia2g1ayhzjf6   mon_unsee              replicated   1/1        cloudflare/unsee:v0.8.0
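Reading the REPLICAS column by eye gets tedious once a stack has many services. The following is a minimal sketch of a filter that prints only the services whose running count differs from the desired count. The function name unconverged_services is hypothetical and not part of swarmprom or Docker; it only parses text in the "name running/desired" shape.

```shell
#!/bin/sh
# Hypothetical helper: print every service whose running replica count
# differs from its desired count.
# Input (stdin): one line per service, "<name> <running>/<desired>".
unconverged_services() {
  awk '{
    split($2, r, "/")                  # r[1] = running, r[2] = desired
    if (r[1] != r[2]) print $1, $2     # keep only services that have not converged
  }'
}

# Usage on a manager node (requires the docker CLI):
#   docker service ls --format '{{.Name}} {{.Replicas}}' | unconverged_services
```

No output means every service has reached its desired replica count, as in the listing above.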
To deploy Portainer on the Swarm as well, add the following declarations to docker-compose.yml:
...
services:
  agent:
    image: portainer/agent
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /var/lib/docker/volumes:/var/lib/docker/volumes
    ports:
      - target: 9001
        published: 9001
        protocol: tcp
        mode: host
    networks:
      - net
    deploy:
      mode: global
      placement:
        constraints: [node.platform.os == linux]
  portainer:
    image: portainer/portainer
    command: -H tcp://tasks.agent:9001 --tlsskipverify
    ports:
      - "8000:8000"
    volumes:
      - portainer_data:/data
    networks:
      - net
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [node.role == manager]
...
# Update the stack with the following command
docker stack deploy -c docker-compose.yml mon
Deploy a service, then let Prometheus discover and monitor it automatically
cat test-compose.yml
version: "3.3"
networks:
  net:
    driver: overlay
    attachable: true
  mon_net:
    external: true
services:
  mongo:
    image: healthcheck/mongo:latest
    networks:
      - net
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role != manager
  mongo-exporter:
    image: forekshub/percona-mongodb-exporter:latest
    networks:
      - net
      - mon_net
    ports:
      - "9216:9216"
    environment:
      - MONGODB_URL=mongodb://mongo:27017
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
# Deploy
docker stack deploy -c test-compose.yml mongo
# List the stacks
docker stack ls
NAME    SERVICES   ORCHESTRATOR
mon     10         Swarm
mongo   2          Swarm
# List the services
docker service ls | grep mongo
o20avg5k0lqb   mongo_mongo            replicated   1/1   healthcheck/mongo:latest
6atp7sl2byeu   mongo_mongo-exporter   replicated   1/1   forekshub/percona-mongodb-exporter:latest   *:9216->9216/tcp
# On one of the nodes, check that the mongo container is running
docker ps -a|grep mongo
102b337589aa healthcheck/mongo:latest "docker-entrypoint.s…" 18 minutes ago Up 18 minutes (healthy) 27017/tcp mongo_mongo.1.whn157ky895refdogo4s3imrw
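The steps above attach mongo-exporter to mon_net but do not show how Prometheus finds it. One common approach in Swarm is DNS service discovery: Prometheus resolves tasks.<service> on the shared overlay network and scrapes every task IP returned. The sketch below writes such a scrape job to a hypothetical file, mongo-job.yml, purely for illustration. It assumes Prometheus and the exporter share mon_net and that the name tasks.mongo-exporter resolves there. How the job reaches the running Prometheus depends on the image. The swarmprom README reportedly offers a JOBS environment variable on the Prometheus service (for example JOBS=mongo-exporter:9216) that generates an equivalent job, so check the repository before relying on either form.

```shell
#!/bin/sh
# Illustrative only: write a DNS-SD scrape job to mongo-job.yml (hypothetical file name).
# Assumption: Prometheus and mongo-exporter are both attached to mon_net, so
# tasks.mongo-exporter resolves to the IPs of the exporter tasks.
cat > mongo-job.yml <<'EOF'
scrape_configs:
  - job_name: 'mongo-exporter'
    dns_sd_configs:
      - names:
          - 'tasks.mongo-exporter'
        type: 'A'        # A records carry no port, so it is set explicitly below
        port: 9216       # port the exporter listens on, per test-compose.yml
EOF

# Sanity check that the exporter answers on its published port (run on any swarm node):
#   curl -s http://localhost:9216/metrics | head
```

Once the job is active, the exporter should be listed on the Prometheus targets page.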
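Summary
That concludes monitoring a Swarm cluster. The stack deployed above already ships with a few simple alerting rules. For detailed configuration of Alertmanager and rules, refer to the official documentation.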
Feel free to follow my WeChat official account, ID: k8stech
Reposted from the WeChat official account: Kubernetes技术栈
Last modified on 2022-7-4 17:07:18