Prometheus+Grafana+Alertmanager搭建全方位的监控告警系统 (2)

注意:通过上面命令生成的promtheus-cfg.yaml文件会有一些问题,$1和$2这种变量在文件里没有,需要在k8s的master1节点打开promtheus-cfg.yaml文件,手动把$1和$2这种变量写进文件里,promtheus-cfg.yaml文件需要手动修改部分如下:

22行的replacement: ':9100'变成replacement: '${1}:9100' 42行的replacement: /api/v1/nodes//proxy/metrics/cadvisor变成 replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor 73行的replacement: 变成replacement: $1:$2

给prometheus pod 创建一个service

cat > prometheus-svc.yaml << EOF --- apiVersion: v1 kind: Service metadata: name: prometheus namespace: monitor-sa labels: app: prometheus spec: type: NodePort ports: - port: 9090 targetPort: 9090 protocol: TCP selector: app: prometheus component: server EOF #查看service在物理机映射的端口 kubectl get svc -n monitor-sa #访问prometheus web ui 界面 :30426/graph #点击页面的Status->Targets,可看到如下,说明我们配置的服务发现可以正常采集数据 prometheus热更新

#为了每次修改配置文件可以热加载prometheus,也就是不停止prometheus,就可以使配置生效,如修改prometheus-cfg.yaml,想要使配置生效可用如下热加载命令:
curl -X POST :9090/-/reload

#10.244.1.66是prometheus的pod的ip地址

#热加载速度比较慢,可以暴力重启prometheus,如修改上面的prometheus-cfg.yaml文件之后,可执行如下强制删除:

kubectl delete -f prometheus-cfg.yaml

kubectl delete -f prometheus-deploy.yaml

然后再通过apply更新:

kubectl apply -f prometheus-cfg.yaml

kubectl apply -f prometheus-deploy.yaml

注意:

线上最好热加载,暴力删除可能造成监控数据的丢失

Grafana安装和配置

下载安装Grafana需要的镜像

上传heapster-grafana-amd64_v5_0_4.tar.gz镜像到k8s的各个master节点和k8s的各个node节点,然后在各个节点手动解压:
docker load -i heapster-grafana-amd64_v5_0_4.tar.gz

镜像所在的百度网盘地址如下:

链接:https://pan.baidu.com/s/1TmVGKxde_cEYrbjiETboEA 提取码:052u 在k8s的master节点创建grafana.yaml cat >grafana.yaml << EOF apiVersion: apps/v1 kind: Deployment metadata: name: monitoring-grafana namespace: kube-system spec: replicas: 1 selector: matchLabels: task: monitoring k8s-app: grafana template: metadata: labels: task: monitoring k8s-app: grafana spec: containers: - name: grafana image: k8s.gcr.io/heapster-grafana-amd64:v5.0.4 ports: - containerPort: 3000 protocol: TCP volumeMounts: - mountPath: /etc/ssl/certs name: ca-certificates readOnly: true - mountPath: /var name: grafana-storage env: - name: INFLUXDB_HOST value: monitoring-influxdb - name: GF_SERVER_HTTP_PORT value: "3000" # The following env variables are required to make Grafana accessible via # the kubernetes api-server proxy. On production clusters, we recommend # removing these env variables, setup auth for grafana, and expose the grafana # service using a LoadBalancer or a public IP. - name: GF_AUTH_BASIC_ENABLED value: "false" - name: GF_AUTH_ANONYMOUS_ENABLED value: "true" - name: GF_AUTH_ANONYMOUS_ORG_ROLE value: Admin - name: GF_SERVER_ROOT_URL # If you're only using the API Server proxy, set this value instead: # value: /api/v1/namespaces/kube-system/services/monitoring-grafana/proxy value: / volumes: - name: ca-certificates hostPath: path: /etc/ssl/certs - name: grafana-storage emptyDir: {} --- apiVersion: v1 kind: Service metadata: labels: # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons) # If you are NOT using this as an addon, you should comment out this line. kubernetes.io/cluster-service: 'true' kubernetes.io/name: monitoring-grafana name: monitoring-grafana namespace: kube-system spec: # In a production setup, we recommend accessing Grafana through an external Loadbalancer # or through a public IP. # type: LoadBalancer # You could also use NodePort to expose the service at a randomly-generated port # type: NodePort ports: - port: 80 targetPort: 3000 selector: k8s-app: grafana type: NodePort EOF

内容版权声明:除非注明,否则皆为本站原创文章。

转载注明出处:https://www.heiqu.com/wpgzwz.html