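**System environment**:
- Prometheus Operator version: 0.6.0 (Prometheus Operator has been renamed Kube-Prometheus, and the version is now 0.6.0)
- Kubernetes version: 1.15.4
# Monitoring Traefik with Prometheus
## Enabling Prometheus in the Traefik configuration file
To monitor the Traefik controller, Traefik must first expose its metrics. This requires adding the following to its configuration file: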
```bash
[metrics]
  [metrics.prometheus]
    entryPoint = "traefik"
    buckets = [0.1,0.3,1.2,5.0]
```
When Traefik was installed, its configuration file was already mounted from a Kubernetes ConfigMap (see [Deploying Traefik Ingress on Kubernetes](https://www.youqiqi.cn/archives/traefikv2x%E9%83%A8%E7%BD%B2) for details). Because the file lives in a ConfigMap, it can be edited with `kubectl edit` to add the Prometheus settings:
```bash
[root@k8s01 ~]# kubectl edit ConfigMap traefik-config -n kube-system
apiVersion: v1
data:
  traefik.toml: |
    # traefik.toml
    debug = true
    InsecureSkipVerify = true
    defaultEntryPoints = ["http","https"]
    [entryPoints]
      [entryPoints.http]
      address = ":80"
      compress = true
      [entryPoints.https]
      address = ":443"
      compress = true
        [entryPoints.https.tls]
          [[entryPoints.https.tls.certificates]]
          CertFile = "/ssl/tls.crt"
          KeyFile = "/ssl/tls.key"
      [entryPoints.traefik]
      address = ":8080"
    [kubernetes]
    [traefikLog]
      format = "json"
      #filePath = "/data/traefik.log"
    [accessLog]
      #filePath = "/data/access.log"
      format = "json"
      [accessLog.filters]
        retryAttempts = true
        minDuration = "10ms"
      [accessLog.fields]
        defaultMode = "keep"
        [accessLog.fields.names]
          "ClientUsername" = "drop"
        [accessLog.fields.headers]
          defaultMode = "keep"
          [accessLog.fields.headers.names]
            "User-Agent" = "redact"
            "Authorization" = "drop"
            "Content-Type" = "keep"
    [api]
      entryPoint = "traefik"
      dashboard = true
    [metrics]
      [metrics.prometheus]
        entryPoint = "traefik"
        buckets = [0.1,0.3,1.2,5.0]
```
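The `buckets` setting determines the upper bounds (the `le` label values) of the latency histogram that Traefik exports. As a rough sketch for checking that the new configuration is live, the helper below reads Prometheus text-format metrics on standard input and lists the bucket bounds it finds. The function name `list_bucket_bounds` and the pod address in the usage comment are hypothetical, and Traefik may need a restart before an edited ConfigMap takes effect.

```bash
# list_bucket_bounds: read Prometheus text-format metrics on stdin and
# print each distinct histogram upper bound (the value of the "le" label).
list_bucket_bounds() {
  grep '_bucket{' \
    | sed -n 's/.*[{,]le="\([^"]*\)".*/\1/p' \
    | LC_ALL=C sort -u
}

# Usage (hypothetical pod IP; 8080 is the "traefik" entry point above):
#   curl -s http://<traefik-pod-ip>:8080/metrics | list_bucket_bounds
```

The printed bounds should correspond to the configured `buckets` values, typically alongside an additional `+Inf` bucket.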
## Labeling the Traefik Service
Prometheus Operator selects Services by label, so the Traefik Service must first be given the label "k8s-app: traefik-ingress".
### Viewing the Traefik Service
```bash
[root@k8s01 ~]# kubectl get service -n kube-system
kube-dns ClusterIP 10.10.0.10 <none> 53/UDP,53/TCP,9153/TCP 79d
kubelet ClusterIP None <none> 10250/TCP 35d
traefik-ingress-service ClusterIP 10.10.114.105 <none> 80/TCP,443/TCP,8080/TCP 56d
```
### Editing the Service to set the label
Edit the Traefik Service:
```bash
[root@k8s01 ~]# kubectl edit service traefik-ingress-service -n kube-system
# Set the label "k8s-app: traefik-ingress"
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2019-04-15T05:06:41Z"
  name: traefik-ingress-service
  namespace: kube-system
  resourceVersion: "85575"
  selfLink: /api/v1/namespaces/kube-system/services/traefik-ingress-service
  uid: 4172b4df-5f3c-11e9-9287-000c29d98697
  labels:
    k8s-app: traefik-ingress     #--- add the label "k8s-app: traefik-ingress"
spec:
  clusterIP: 10.10.114.105
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  - name: https
    port: 443
    protocol: TCP
    targetPort: 443
  - name: admin                  #--- Prometheus metrics are exposed on port 8080
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    k8s-app: traefik-ingress-lb
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
```
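Instead of editing the Service interactively, the same label can be applied with `kubectl label`. The wrapper below is a minimal sketch; the function name `label_service` is hypothetical, while the Service name, namespace, and label are the ones used in this section.

```bash
# label_service NAME NAMESPACE KEY=VALUE
# Adds the label to the Service, or overwrites it if it already exists.
label_service() {
  kubectl label service "$1" -n "$2" "$3" --overwrite
}

# Usage:
#   label_service traefik-ingress-service kube-system k8s-app=traefik-ingress
#   kubectl get service -n kube-system -l k8s-app=traefik-ingress --show-labels
```

The second usage line lists only Services carrying the label, which is a quick way to confirm that the selector used by the ServiceMonitor will match.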
## Configuring the monitoring rule in Prometheus Operator
Create a ServiceMonitor resource to monitor the Traefik controller:
**traefik-monitor.yaml**
```bash
[root@k8s01 ~]# vim traefik-monitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: traefik-ingress
  namespace: monitoring
  labels:
    k8s-app: traefik-ingress
spec:
  jobLabel: k8s-app
  endpoints:
  - port: admin             #--- name of the Traefik Service's 8080 port: admin
    interval: 30s
  selector:
    matchLabels:
      k8s-app: traefik-ingress
  namespaceSelector:
    matchNames:
    - kube-system
[root@k8s01 ~]# kubectl apply -f traefik-monitor.yaml
```
## Viewing the Prometheus rules
Open the Prometheus UI and check the Prometheus rules; the Traefik data should be present (allow a short wait after creating the ServiceMonitor).
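The same check can be made from the command line through the Prometheus HTTP API by querying the `up` series for the job. This is only a sketch: the function name `query_up` and the local address are assumptions (for example a port-forward to the Prometheus service), and the job name `traefik-ingress` follows from `jobLabel: k8s-app` together with the label set on the Service.

```bash
# query_up BASE_URL JOB
# Runs the instant query up{job="JOB"} against the Prometheus HTTP API
# and prints the raw JSON response.
query_up() {
  curl -s -G "$1/api/v1/query" --data-urlencode "query=up{job=\"$2\"}"
}

# Usage (assumes Prometheus is reachable on localhost:9090):
#   query_up http://localhost:9090 traefik-ingress
```

A sample value of `1` in the response means the target was scraped successfully; an empty result suggests the ServiceMonitor has not been picked up yet.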

## Importing the Grafana dashboard
Open Grafana and import the dashboard with ID "4475".


The dashboard then appears (**if there is no data, first visit a domain served through Traefik Ingress to generate some traffic, then narrow the time range**).
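If the panels stay empty, a handful of requests through the ingress is enough to produce samples. The loop below is a small sketch for doing that; `generate_traffic` and the example host name are hypothetical, so substitute a domain that is actually routed by a Traefik Ingress.

```bash
# generate_traffic URL [COUNT]
# Sends COUNT requests (default 20) to URL, discarding the responses,
# then reports how many requests were attempted.
generate_traffic() {
  url=$1
  count=${2:-20}
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null "$url" || true   # ignore individual failures
    i=$((i + 1))
  done
  echo "sent $count requests to $url"
}

# Usage (hypothetical host name):
#   generate_traffic http://app.example.test/ 50
```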

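# Monitoring Nginx with Prometheus Operator
First make sure the metrics endpoint is reachable.
## Creating the Service
Start by creating the Service that fronts ingress-nginx. The ServiceMonitor below finds the corresponding Pods through this Service, relying on the Service's service discovery.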
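One way to confirm this is to fetch the controller's metrics endpoint and count the samples that carry the controller's metric prefix. The sketch below assumes the controller serves metrics on port 10254 at `/metrics` (the port used for the `metrics` Service port in this section); the function name `count_metrics` and the pod address are hypothetical.

```bash
# count_metrics URL PREFIX
# Prints how many lines fetched from URL start with PREFIX and returns
# success only if at least one such line was found.
count_metrics() {
  n=$(curl -s "$1" | grep -c "^$2" || true)
  echo "$n"
  [ "$n" -gt 0 ]
}

# Usage (hypothetical pod IP):
#   count_metrics http://<controller-pod-ip>:10254/metrics nginx_ingress_controller_
```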
```bash
[root@k8s01 ~]# vim nginx-server.yaml
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx      # same namespace as the ingress controller
  labels:
    app: ingress-nginx          # note this label; the ServiceMonitor uses it
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 80
    targetPort: 80
    protocol: TCP
  - name: https
    port: 443
    targetPort: 443
    protocol: TCP
  - name: metrics               # the three port names match those in the DaemonSet
    port: 10254
    targetPort: 10254
    protocol: TCP
  selector:
    app.kubernetes.io/name: ingress-nginx   # matches the Pod label in the DaemonSet
```
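## Creating the ServiceMonitor
The ServiceMonitor defines the objects to be monitored. Once it has been created, the ingress configuration and state can be seen in Prometheus under Status > Configuration and Status > Targets.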
```bash
[root@k8s01 ~]# vim servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: ingress-nginx      # kept consistent with the Service
  name: nginx-ingress-controller-metrics
  namespace: ingress-nginx      # ideally alongside the monitored object; monitoring also works
spec:
  endpoints:
  - interval: 15s
    port: metrics               # same name as the Service port that exposes metrics
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - ingress-nginx             # namespace of the monitored object
  selector:
    matchLabels:
      app: ingress-nginx        # matches the Service's label, not the Service's selector
```
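## Verification
In the Prometheus UI, check Status > Targets. If the job shows up under Status > Configuration but not under Targets, the cause may be a permissions problem.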
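One possible cause in Kube-Prometheus setups is that the Prometheus service account is not allowed to discover Services, Endpoints, and Pods in the `ingress-nginx` namespace. The sketch below prints a Role and RoleBinding that grant this access. It assumes the service account is `prometheus-k8s` in the `monitoring` namespace, which should be checked against the actual installation; the function name `prometheus_rbac` is hypothetical.

```bash
# prometheus_rbac NAMESPACE
# Prints a Role and RoleBinding that let the (assumed) prometheus-k8s
# service account in "monitoring" read discovery objects in NAMESPACE.
prometheus_rbac() {
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: $1
rules:
- apiGroups: [""]
  resources: ["services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s
  namespace: $1
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
subjects:
- kind: ServiceAccount
  name: prometheus-k8s      # assumption: Kube-Prometheus default name
  namespace: monitoring     # assumption: Kube-Prometheus default namespace
EOF
}

# Usage:
#   prometheus_rbac ingress-nginx | kubectl apply -f -
```

Printing the manifest rather than applying it directly leaves room to review it before piping it to `kubectl apply`.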

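## Importing the Grafana dashboard
In Grafana, click +, choose Import, and import nginx.json:
[nginx.json](https://github.com/kubernetes/ingress-nginx/blob/dfa7f10fc9691a3be90fd30cb458b64b617ef440/deploy/grafana/dashboards/nginx.json)
After the import, data is already arriving, but because no application is sending traffic through the ingress yet, many panels are still empty.

Create an Ingress rule and then adjust the time range to see the panels fill in.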

Monitoring ingress controllers with Prometheus Operator