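Kubernetes Prometheus DingTalk Alerting
System Environment
Prometheus version: 2.36.0
Kubernetes version: 1.23.0
dingtalk version: 2.1.0
prometheus-webhook-dingtalk is an open-source plugin that forwards Alertmanager notifications to DingTalk. At the time of writing, its latest release is still v2.1.0.
Configuring the DingTalk Robot
In DingTalk, add a group robot to the chat that should receive the alerts, and record its webhook URL and signing secret for the configuration that follows.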
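If the robot is created with the signature (加签) security option, every request to it must carry a `timestamp` and a `sign` parameter. prometheus-webhook-dingtalk computes these itself when a `secret` is set on the target, so nothing below is needed for the deployment; the following is only a minimal sketch of that scheme for anyone who wants to test the robot by hand. The function name and the example token are made up for illustration.

```python
import base64
import hashlib
import hmac
import urllib.parse


def sign_webhook_url(webhook_url: str, secret: str, timestamp_ms: int) -> str:
    """Append DingTalk's timestamp and sign parameters to a robot webhook URL.

    The string to sign is "<timestamp in ms>\n<secret>", signed with
    HMAC-SHA256 keyed by the secret, then Base64- and URL-encoded.
    """
    string_to_sign = f"{timestamp_ms}\n{secret}".encode("utf-8")
    digest = hmac.new(secret.encode("utf-8"), string_to_sign, hashlib.sha256).digest()
    sign = urllib.parse.quote_plus(base64.b64encode(digest).decode("ascii"))
    return f"{webhook_url}&timestamp={timestamp_ms}&sign={sign}"
```

A call would pass the current time in milliseconds, for example `sign_webhook_url(url, secret, int(time.time() * 1000))`.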
Deploying dingtalk
[root@k8s-master-01 monitoring]# vim dingtalk.yaml
apiVersion: v1
kind: Service
metadata:
  name: dingtalk
  namespace: monitoring
spec:
  selector:
    app: dingtalk
  ports:
  - name: http
    protocol: TCP
    port: 8060
    targetPort: 8060
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dingtalk
  namespace: monitoring
  labels:
    app: dingtalk
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  selector:
    matchLabels:
      app: dingtalk
  template:
    metadata:
      labels:
        app: dingtalk
    spec:
      restartPolicy: "Always"
      containers:
      - name: dingtalk
        image: timonwong/prometheus-webhook-dingtalk:v2.1.0
        imagePullPolicy: "IfNotPresent"
        volumeMounts:
        - name: dingtalk-conf
          mountPath: /etc/prometheus-webhook-dingtalk/
        resources:
          limits:
            cpu: "400m"
            memory: "500Mi"
          requests:
            cpu: "100m"
            memory: "100Mi"
        ports:
        - containerPort: 8060
          name: http
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          periodSeconds: 5
          initialDelaySeconds: 30
          successThreshold: 1
          tcpSocket:
            port: 8060
        livenessProbe:
          tcpSocket:
            port: 8060
          initialDelaySeconds: 30
          periodSeconds: 10
      volumes:
      - name: dingtalk-conf
        configMap:
          name: dingtalk-cm
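Both probes in the Deployment are `tcpSocket` checks on port 8060: the kubelet only verifies that a TCP connection can be opened, not that the webhook answers correctly. As a rough illustration, the same kind of check can be written in a few lines. The helper name is hypothetical, and the 1-second timeout mirrors the default probe `timeoutSeconds`, which the manifest does not override.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: the probe would count a failure.
        return False
```

Run from another pod in the cluster, `tcp_port_open("dingtalk.monitoring", 8060)` should return True once the container is listening, assuming cluster DNS resolves the Service name.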
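DingTalk Alert Template
If some fields render empty, adjust the template to match the labels and annotations that your alert rules actually define.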
[root@k8s-master-01 monitoring]# vim dingtalk-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: dingtalk-cm
  namespace: monitoring
data:
  config.yml: |-
    templates:
      - /etc/prometheus-webhook-dingtalk/dingding.tmpl
    targets:
      webhook:
        url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxxxxxxxxxx
        secret: "****************************"
        message:
          text: '{{ template "dingtalk.to.message" . }}'
  # Each section renders only the first alert in its list (the `eq $index 0` check).
  # 28800e9 is 8 hours in nanoseconds; it shifts the timestamps by +8 h (UTC to UTC+8).
  dingding.tmpl: |-
    {{ define "dingtalk.to.message" }}
    {{- if gt (len .Alerts.Firing) 0 -}}
    {{- range $index, $alert := .Alerts.Firing -}}
    {{- if eq $index 0 }}
    ========= Monitoring Alert ========={{ "\n" }}
    **Status:** {{ $alert.Status }}{{ "\n" }}
    **Level:** {{ $alert.Labels.level }}{{ "\n" }}
    **Alert name:** {{ $alert.Labels.alertname }}{{ "\n" }}
    **Instance:** {{ $alert.Labels.instance }}{{ "\n" }}
    **Summary:** {{ $alert.Annotations.summary }}{{ "\n" }}
    **Details:** {{ $alert.Annotations.message }}{{ $alert.Annotations.description}};{{ "\n" }}
    **Trigger value:** {{ $alert.Annotations.value }}{{ "\n" }}
    **Started at:** {{ ($alert.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}{{ "\n" }}
    ========= = end = =========
    {{- end }}
    {{- end }}
    {{- end }}
    {{- if gt (len .Alerts.Resolved) 0 -}}
    {{- range $index, $alert := .Alerts.Resolved -}}
    {{- if eq $index 0 }}
    ========= Alert Resolved ========={{ "\n" }}
    **Alert name:** {{ $alert.Labels.alertname }}{{ "\n" }}
    **Status:** {{ $alert.Status }}{{ "\n" }}
    **Summary:** {{ $alert.Annotations.summary }}{{ "\n" }}
    **Details:** {{ $alert.Annotations.message }}{{ $alert.Annotations.description}};{{ "\n" }}
    **Started at:** {{ ($alert.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}{{ "\n" }}
    **Resolved at:** {{ ($alert.EndsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}{{ "\n" }}
    {{- if gt (len $alert.Labels.instance) 0 }}
    **Instance:** {{ $alert.Labels.instance }}{{ "\n" }}
    {{- end }}
    ========= = end = =========
    {{- end }}
    {{- end }}
    {{- end }}
    {{- end }}
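The template formats times with `.Add 28800e9` followed by `.Format "2006-01-02 15:04:05"`. The argument to `Add` is a duration in nanoseconds, so 28800e9 means 28,800 seconds, or 8 hours, and the Go layout string corresponds to year-month-day hour:minute:second. The arithmetic can be checked with a short sketch; the function name is illustrative, and the input is assumed to be a UTC timestamp.

```python
from datetime import datetime, timedelta, timezone

OFFSET_NS = int(28800e9)  # the constant used in the template, in nanoseconds


def shift_and_format(ts: datetime, offset_ns: int = OFFSET_NS) -> str:
    """Shift a timestamp by offset_ns and format it like the Go layout
    "2006-01-02 15:04:05"."""
    shifted = ts + timedelta(seconds=offset_ns / 1e9)
    return shifted.strftime("%Y-%m-%d %H:%M:%S")


starts_at = datetime(2022, 6, 1, 16, 30, 0, tzinfo=timezone.utc)
print(shift_and_format(starts_at))  # 2022-06-02 00:30:00
```

For a different time zone, only the constant changes: one hour is 3600e9 nanoseconds.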
Configuring the DingTalk Webhook
[root@k8s-master-01 monitoring]# vim alertmanager-config.yaml
...
route:
  group_by: ['env','instance','type','group','job','alertname','cluster']
  group_wait: 10s
  group_interval: 2m
  repeat_interval: 10m
  receiver: 'webhook'
  routes:
  - receiver: 'email'
    match:
      severity: info
receivers:
- name: 'webhook'
  webhook_configs:
  - send_resolved: true
    url: 'http://dingtalk:8060/dingtalk/webhook/send'
...
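With this receiver, Alertmanager POSTs a JSON document to the dingtalk Service, and the template reads its labels and annotations from that document. The sketch below builds a payload in the shape of Alertmanager's webhook format (version 4) so the template can be exercised without waiting for a real alert. The alert name, instance, and annotation values are invented test data, and the URL only resolves from inside the `monitoring` namespace. Calling `post_payload` sends a real message to the DingTalk group.

```python
import json
import urllib.request
from datetime import datetime, timezone

DINGTALK_URL = "http://dingtalk:8060/dingtalk/webhook/send"  # from the receiver above
ZERO_TIME = "0001-01-01T00:00:00Z"  # Alertmanager's endsAt for alerts still firing


def build_test_payload(status: str = "firing") -> dict:
    """Build a one-alert payload carrying the fields the template references."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    labels = {  # hypothetical test values
        "alertname": "DingtalkSmokeTest",
        "instance": "test-host:9100",
        "level": "warning",
    }
    annotations = {
        "summary": "Test alert",
        "description": "Sent by hand to check the template",
        "value": "42",
    }
    return {
        "version": "4",
        "groupKey": '{}:{alertname="DingtalkSmokeTest"}',
        "status": status,
        "receiver": "webhook",
        "groupLabels": {"alertname": labels["alertname"]},
        "commonLabels": labels,
        "commonAnnotations": annotations,
        "externalURL": "",
        "alerts": [{
            "status": status,
            "labels": labels,
            "annotations": annotations,
            "startsAt": now,
            "endsAt": now if status == "resolved" else ZERO_TIME,
            "generatorURL": "",
        }],
    }


def post_payload(url: str, payload: dict, timeout: float = 5.0) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

If a field such as `level` or `value` shows up empty in the group message, the corresponding label or annotation is missing from the alert rule rather than from the template.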
Deploy and Verify
[root@k8s-master-01 monitoring]# kubectl apply -f alertmanager-config.yaml
[root@k8s-master-01 monitoring]# kubectl apply -f dingtalk-configmap.yaml
[root@k8s-master-01 monitoring]# kubectl apply -f dingtalk.yaml
# Check whether the deployment succeeded
[root@k8s-master-01 monitoring]# kubectl get pods -n monitoring | grep dingtalk
dingtalk-5869c868b7-mmvf7   1/1     Running   0          4min
Once dingtalk is running, redeploy Alertmanager so that it loads the new receiver; alerts are then delivered to the DingTalk group.
Copyright notice:
Unless otherwise stated, all articles on this site are licensed under CC BY-NC-SA 4.0. Please credit 爱吃可爱多 when reposting.