1. Kubernetes Monitoring
In this chapter, we install the Kubernetes Dashboard and Prometheus + Grafana ourselves and practice monitoring a cluster with them.
1.1 Kubernetes Dashboard
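The Dashboard is a web-based UI for monitoring and managing Kubernetes.
(1) Download the dashboard YAML file.
In the downloaded file, change the type of the kubernetes-dashboard Service to NodePort or LoadBalancer (add type: NodePort or type: LoadBalancer under the Service's spec).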
12/00-dashboard-install.txt
curl -LO https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
(2) Apply the modified YAML file to create the dashboard.
kubectl apply -f recommended.yaml
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created
(3) Create the token required for login and use it to sign in.
12/01-dashboard-user.yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
kubectl create -f 01-dashboard-user.yaml
serviceaccount/admin-user created
clusterrolebinding.rbac.authorization.k8s.io/admin-user created

Generate a token for admin-user (for example with kubectl -n kubernetes-dashboard create token admin-user, available in kubectl v1.24 and later), then copy the token and use it when logging in.

eyJhbGciOiJSUzI1NiIsImtpZCI6ImNlZ015NjBDby15R1c4NVlIMi1vTGVVQXhPdGVzeEk5cEV0NGJiWl92RFEifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjk0Njc3NDgxLCJpYXQiOjE2OTQ2NzM4ODEsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJhZG1pbi11c2VyIiwidWlkIjoiYTRjOGRmMzQtNGY4Yy00OGE5LWJmMjEtYTgyZDliODhhYzAwIn19LCJuYmYiOjE2OTQ2NzM4ODEsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDphZG1pbi11c2VyIn0.hF6NmSdHK-_nysVWZkQdLZZPtKDzPKq1VUfVKn3TjP1d44QfFoBYxVvcbC9mjcphPT0ggA-NKFcZalUN5a8X04lmk77Vel7PFyrXy4Q7WNx2OxiQN88U7F291pyptif0_sk44pZA58MMLYKEq9WKFGnKufyfZJeDsLcFq2YwkTtk4DYrJ47KeibKRnXf3QJ6Wcw4e61x0vBy_VU_vpCPkD4lC9bQKVRRy-Y-thevbAfE5FRthvADsNodBdIEZReXMGnDkvog_LL0okSGz6wKfWE8CpvEGhZQpydlagJGCcs6A15_wr05RyZBNtaoxOdFTDieeYEqKcio0qUJNTGOhA

kubectl get svc -A
NAMESPACE              NAME                   TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
kubernetes-dashboard   kubernetes-dashboard   NodePort   10.109.126.84   <none>        443:31963/TCP   8m35s

Open https://nodeip:nodeport (NodePort 31963 in this example) and log in with the token obtained above. Alternatively, use kubectl edit to change the Service type to LoadBalancer.
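The login steps above can be wrapped in a few small helper functions. This is only a sketch: the namespace, ServiceAccount, and Service names come from the manifests in this section, while the function names are hypothetical. It assumes kubectl v1.24 or later, which provides create token.

```shell
#!/bin/sh
# Sketch: helpers for logging in to the dashboard (function names are hypothetical).
NS=kubernetes-dashboard

# Request a short-lived login token for the admin-user ServiceAccount.
dashboard_token() {
  kubectl -n "$NS" create token admin-user
}

# Print the NodePort assigned to the dashboard Service.
dashboard_nodeport() {
  kubectl -n "$NS" get svc kubernetes-dashboard \
    -o jsonpath='{.spec.ports[0].nodePort}'
}

# Change the Service type (NodePort or LoadBalancer) without opening an editor.
dashboard_set_type() {
  kubectl -n "$NS" patch svc kubernetes-dashboard \
    -p "{\"spec\":{\"type\":\"$1\"}}"
}
```

For example, dashboard_set_type LoadBalancer replaces the interactive kubectl edit step, and dashboard_token prints a token to paste into the login screen.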
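(4) Access the dashboard through a browser.
1.2 Prometheus and Grafana
Prometheus and Grafana are solutions specialized for monitoring.
In production environments, Grafana is one of the most widely used tools for monitoring dashboards.
Prometheus collects and stores the metrics, and Grafana queries the data it needs and presents it as dashboards.
kube-state-metrics and node-exporter are used to collect metrics for Kubernetes objects and for the nodes, respectively.
As the cluster grows, the number of scrape targets and the volume of metrics grow with it, so network traffic and storage need to be sized carefully.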
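To make the storage sizing concrete, the Prometheus documentation gives a rough rule: needed disk space = retention time in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample on average. The sketch below applies that rule. The function name and the example numbers (15 days, 10,000 samples/s, 2 bytes) are hypothetical, not measurements from this cluster.

```shell
#!/bin/sh
# Rough Prometheus disk estimate (sketch):
#   bytes = retention_days * 86400 * samples_per_second * bytes_per_sample
# Arguments: $1 = retention in days, $2 = samples per second, $3 = bytes per sample
estimate_disk_bytes() {
  echo $(( $1 * 86400 * $2 * $3 ))
}

# Hypothetical cluster: 15 days retention, 10,000 samples/s, 2 bytes per sample
estimate_disk_bytes 15 10000 2   # prints 25920000000 (about 26 GB)
```

Samples per second is approximately the number of active time series divided by the scrape interval, so doubling the number of nodes or halving the interval roughly doubles the required storage.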
(1) Install Prometheus & Grafana
12/02-monitoring-install.txt
git clone https://github.com/prometheus-operator/kube-prometheus.git
cd kube-prometheus/
kubectl create -f manifests/setup
customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusagents.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/scrapeconfigs.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com created
namespace/monitoring created

The following command waits until the Prometheus ServiceMonitor CRD has been installed. It has succeeded once "No resources found" is printed.

until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
kubectl create -f manifests/
alertmanager.monitoring.coreos.com/main created
networkpolicy.networking.k8s.io/alertmanager-main created
poddisruptionbudget.policy/alertmanager-main created
.. (omitted) ..

Installation can take some time. Once everything has been deployed, the resources look like this:
kubectl get all -n monitoring
NAME                                       READY   STATUS    RESTARTS   AGE
pod/alertmanager-main-0                    2/2     Running   0          45s
pod/alertmanager-main-1                    2/2     Running   0          45s
pod/alertmanager-main-2                    2/2     Running   0          45s
pod/blackbox-exporter-6cfc4bffb6-srf94     3/3     Running   0          51s
pod/grafana-748964b847-rwrk5               1/1     Running   0          49s
pod/kube-state-metrics-6b4d48dcb4-jb7qt    3/3     Running   0          49s
pod/node-exporter-5v9lk                    2/2     Running   0          48s
pod/node-exporter-9b7md                    2/2     Running   0          48s
pod/node-exporter-bmdpz                    2/2     Running   0          48s
pod/prometheus-adapter-79c588b474-72k7p    1/1     Running   0          47s
pod/prometheus-adapter-79c588b474-nq4pb    1/1     Running   0          47s
pod/prometheus-k8s-0                       2/2     Running   0          44s
pod/prometheus-k8s-1                       2/2     Running   0          44s
pod/prometheus-operator-68f6c79f9d-x4jjm   2/2     Running   0          47s

NAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/alertmanager-main       ClusterIP   10.105.88.232   <none>        9093/TCP,8080/TCP            51s
service/alertmanager-operated   ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   45s
service/blackbox-exporter       ClusterIP   10.96.11.27     <none>        9115/TCP,19115/TCP           51s
service/grafana                 ClusterIP   10.101.206.55   <none>        3000/TCP                     50s
service/kube-state-metrics      ClusterIP   None            <none>        8443/TCP,9443/TCP            49s
service/node-exporter           ClusterIP   None            <none>        9100/TCP                     49s
service/prometheus-adapter      ClusterIP   10.104.99.31    <none>        443/TCP                      48s
service/prometheus-k8s          ClusterIP   10.97.36.1      <none>        9090/TCP,8080/TCP            48s
service/prometheus-operated     ClusterIP   None            <none>        9090/TCP                     44s
service/prometheus-operator     ClusterIP   None            <none>        8443/TCP                     47s

NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/node-exporter   3         3         3       3            3           kubernetes.io/os=linux   49s

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/blackbox-exporter     1/1     1            1           51s
deployment.apps/grafana               1/1     1            1           50s
deployment.apps/kube-state-metrics    1/1     1            1           49s
deployment.apps/prometheus-adapter    2/2     2            2           48s
deployment.apps/prometheus-operator   1/1     1            1           47s

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/blackbox-exporter-6cfc4bffb6     1         1         1       51s
replicaset.apps/grafana-748964b847               1         1         1       50s
replicaset.apps/kube-state-metrics-6b4d48dcb4    1         1         1       49s
replicaset.apps/prometheus-adapter-79c588b474    2         2         2       48s
replicaset.apps/prometheus-operator-68f6c79f9d   1         1         1       47s

NAME                                 READY   AGE
statefulset.apps/alertmanager-main   3/3     45s
statefulset.apps/prometheus-k8s      2/2     44s
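As an alternative to polling with the until loop in step (1), kubectl wait can block until the CRDs report the Established condition. The sketch below follows the pattern shown in the kube-prometheus README. The function names are hypothetical, and the commands assume the current directory is the cloned kube-prometheus repository.

```shell
#!/bin/sh
# Sketch: install kube-prometheus without a polling loop (function names are hypothetical).

# Block until every CustomResourceDefinition reports the Established condition.
wait_for_crds() {
  kubectl wait --for condition=Established --all CustomResourceDefinition --namespace=monitoring
}

# Create the CRDs and namespace first, wait for the CRDs, then create the remaining manifests.
install_stack() {
  kubectl create -f manifests/setup &&
    wait_for_crds &&
    kubectl create -f manifests/
}
```

Running install_stack from the kube-prometheus directory performs the same three steps as above. Because the steps are chained with &&, the final manifests are not applied if the wait fails or times out.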
(2) Change the Service type of Grafana and Prometheus to LoadBalancer so that they can be reached from outside the cluster.
kubectl get svc -n monitoring
NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
alertmanager-main       ClusterIP   10.97.153.207   <none>        9093/TCP,8080/TCP            87s
alertmanager-operated   ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   82s
blackbox-exporter       ClusterIP   10.110.37.93    <none>        9115/TCP,19115/TCP           87s
grafana                 ClusterIP   10.110.57.39    <none>        3000/TCP                     86s
kube-state-metrics      ClusterIP   None            <none>        8443/TCP,9443/TCP            85s
node-exporter           ClusterIP   None            <none>        9100/TCP                     85s
prometheus-adapter      ClusterIP   10.101.228.1    <none>        443/TCP                      84s
prometheus-k8s          ClusterIP   10.108.23.32    <none>        9090/TCP,8080/TCP            84s
prometheus-operated     ClusterIP   None            <none>        9090/TCP                     80s
prometheus-operator     ClusterIP   None            <none>        8443/TCP                     83s

In each Service, set spec.type to LoadBalancer:

kubectl -n monitoring edit svc grafana
spec:
  type: LoadBalancer

kubectl -n monitoring edit svc prometheus-k8s
spec:
  type: LoadBalancer

Delete the NetworkPolicies (netpol) for grafana and prometheus-k8s, which would otherwise restrict incoming traffic to these pods. (NetworkPolicy is covered later.)

kubectl -n monitoring delete netpol grafana prometheus-k8s
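The two interactive edits can also be done non-interactively with kubectl patch. This is a minimal sketch using the Service names from the listing above. The function names are hypothetical, and it assumes the cluster has a LoadBalancer implementation that assigns an IP address rather than a hostname.

```shell
#!/bin/sh
# Sketch: expose Grafana and Prometheus without opening an editor
# (function names are hypothetical).

# Set spec.type to LoadBalancer on both Services.
expose_monitoring() {
  for svc in grafana prometheus-k8s; do
    kubectl -n monitoring patch svc "$svc" -p '{"spec":{"type":"LoadBalancer"}}'
  done
}

# Print the external IP assigned to the given Service.
external_ip() {
  kubectl -n monitoring get svc "$1" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```

After expose_monitoring, external_ip grafana prints the address to use on port 3000 and external_ip prometheus-k8s the address to use on port 9090. The output is empty until the load balancer has assigned an address.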
(3) Check the LoadBalancer IP and open Grafana in a browser on port 3000.
Log in with admin / admin.
(4) Enter ID 13770 to import the dashboard: https://grafana.com/grafana/dashboards/13770-1-kubernetes-all-in-one-cluster-monitoring-kr/
Set the data source to Prometheus.
(5) Check the dashboard screen.
(6) Check the LoadBalancer IP and open Prometheus in a browser on port 9090.
Select Status → Targets to check the scrape targets.
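The same target information is available from the Prometheus HTTP API at /api/v1/targets, which is handy for a quick check from a terminal. The sketch below counts the targets whose health is "up". The function name is hypothetical and the URL is a placeholder for the LoadBalancer IP found above. It also assumes the API returns compact JSON with no spaces around the colon, so a JSON-aware tool would be more robust.

```shell
#!/bin/sh
# Sketch: count healthy scrape targets through the Prometheus HTTP API.
# $1 is the Prometheus base URL, e.g. http://<LoadBalancer IP>:9090 (placeholder).
count_up_targets() {
  curl -s "$1/api/v1/targets" | grep -o '"health":"up"' | wc -l | tr -d ' '
}
```

For example, count_up_targets http://<LoadBalancer IP>:9090 prints the number of targets that Prometheus is currently scraping successfully, which should match the "up" entries on the Targets page.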