ArgoCD GitOps in Practice: From Installation to a Complete CD Flow

This post documents installing ArgoCD on a k3s cluster, creating an ArgoCD Application, and moving the whole CD flow from "GitHub Actions runs kubectl apply directly" to "git is the source of truth and ArgoCD handles the sync".

Push-based vs Pull-based CD

flowchart TD
    subgraph Push-based
        P1[Developer push] --> P2[GitHub Actions]
        P2 -->|kubectl apply + K8s credentials| P3[K8s Cluster]
    end

    subgraph Pull-based GitOps
        G1[Developer push] --> G2[GitHub Actions]
        G2 -->|push image + update YAML| G3[Git Repo]
        G3 -->|ArgoCD polls every 3min| G4[ArgoCD in Cluster]
        G4 -->|kubectl apply internal| G5[K8s Cluster]
    end
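
The two models compare as follows:

| | Push-based | Pull-based (ArgoCD) |
|---|---|---|
| CI needs K8s permissions | Yes | No |
| Drift detection | None | Detected and corrected automatically |
| Rollback | Manually re-run an old workflow | One click in the UI, or git revert |
| Visibility | CI log | Full sync status in the ArgoCD UI |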

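The fundamental problem with push-based CD is a blurred security boundary: the CI runner holds K8s credentials, so if they leak, an attacker can operate on the cluster directly. Pull-based CD keeps control inside the cluster, and CI only needs write access to git.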

ArgoCD Architecture

flowchart LR
    Git["Git Repo<br/>k8s/base/"] -->|clone + render| RS[argocd-repo-server]
    RS -->|desired state| AC[argocd-application-controller]
    AC <-->|compare| K8s["k3s Cluster<br/>actual state"]
    AC -->|diff found: sync| K8s
    Redis["argocd-redis<br/>cache"] --- AC
    Server["argocd-server<br/>UI + API"] --- AC

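Responsibilities of each component:

| Component | Responsibility |
|---|---|
| argocd-server | UI + API; the entry point for operations |
| argocd-repo-server | Clones the git repo and renders YAML (Helm / Kustomize / plain) |
| argocd-application-controller | The core: compares desired vs actual state, by default about every 3 minutes |
| argocd-applicationset-controller | Manages multiple Applications in bulk |
| argocd-dex-server | SSO authentication (GitHub OAuth, LDAP) |
| argocd-redis | Caches application state to speed up comparison |
| argocd-notifications-controller | Sends notifications (Slack, email) |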
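
The roughly 3-minute comparison interval is a default, not a fixed value. As a rough sketch only: it assumes the stock install.yaml (namespace argocd, the controller running as a StatefulSet, the repo server as a Deployment) and the `timeout.reconciliation` key in the argocd-cm ConfigMap. The function name and the 60s value are illustrative, and the exact default and restart requirements vary between ArgoCD versions, so check the docs for the one you run.

```shell
# Hypothetical helper: change how often ArgoCD re-checks git for changes.
# Assumes the stock install: namespace "argocd", ConfigMap "argocd-cm".
set_reconcile_interval() {
  interval="$1"   # a duration string such as "60s"

  # Merge-patch only the one key; other settings in argocd-cm are untouched.
  kubectl -n argocd patch configmap argocd-cm --type merge \
    -p "{\"data\":{\"timeout.reconciliation\":\"${interval}\"}}"

  # A restart is typically needed before the new value takes effect.
  kubectl -n argocd rollout restart deployment argocd-repo-server
  kubectl -n argocd rollout restart statefulset argocd-application-controller
}
```

Calling `set_reconcile_interval 60s` would shorten the delay between a merge and the sync, at the cost of more frequent git fetches.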

Rendering YAML means ArgoCD expands a Helm template or Kustomize overlay into valid K8s YAML. Plain YAML needs no rendering: what is in git is exactly what gets applied.

Installing ArgoCD

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
kubectl get pods -n argocd -w

Once all pods are Running, access the UI:

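# Forward the ArgoCD UI/API to localhost (this blocks, so run it in its own terminal)
kubectl port-forward svc/argocd-server -n argocd 8080:443

# In a second terminal: print the initial admin password
kubectl -n argocd get secret argocd-initial-admin-secret \
  -o jsonpath="{.data.password}" | base64 -d && echo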

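Open https://localhost:8080 in a browser and log in as admin with the password printed above (the default certificate is self-signed, so expect a browser warning).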

k8s Directory Layout

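The path ArgoCD watches should contain only the YAML it is meant to manage. If practice test resources and production resources share one directory, ArgoCD cannot tell them apart: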

k8s/
  base/            <- ArgoCD managed
    deployment.yaml
    service.yaml
    configmap.yaml
    hpa.yaml
    secret.yaml
  argocd/          <- Application manifest
    application.yaml
  test/            <- excluded from ArgoCD
    test-namespace.yaml
    test-liveness.yaml
    test-probe.yaml

base/ is ArgoCD's source of truth. test/ is excluded from the GitOps flow entirely.

Application Manifest as Code

An ArgoCD Application should live in git, not only be clicked together in the UI. Settings created in the UI have no version control, which violates GitOps principles:

# k8s/argocd/application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: go-api
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/your-repo
    targetRevision: HEAD
    path: k8s/base
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
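
What each field means:

| Field | Meaning |
|---|---|
| namespace: argocd | The Application resource itself lives in the argocd namespace (fixed in a default install) |
| project: default | The ArgoCD Project, used for RBAC and resource isolation |
| targetRevision: HEAD | Always tracks the latest commit |
| path: k8s/base | Watches all YAML under this directory |
| server: https://kubernetes.default.svc | The cluster ArgoCD itself runs in |
| automated.prune: true | Resources deleted from git are deleted from the cluster too |
| automated.selfHeal: true | Manual changes to the cluster are reverted to the state in git |

Apply the manifest:
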
kubectl apply -f k8s/argocd/application.yaml

If the Application was already created from the UI, kubectl apply updates the existing one and the YAML wins. From then on, the UI is for observation only.
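
To confirm from the command line that the Application was picked up, its sync and health status can be read straight from the resource. A minimal sketch, assuming the Application name go-api from the manifest above; the helper name `app_status` is made up for illustration.

```shell
# Hypothetical helper: print "<sync status> <health status>" for an Application.
# Uses the fully qualified resource name to avoid clashing with other CRDs.
app_status() {
  app="$1"
  kubectl -n argocd get applications.argoproj.io "$app" \
    -o jsonpath='{.status.sync.status} {.status.health.status}{"\n"}'
}
```

Running `app_status go-api` after the first sync should report the same state the UI shows for the application.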

imagePullSecret: Letting k3s Pull from Artifact Registry

k3s is a local cluster with no GCP-provided identity (no Workload Identity), so it needs explicit credentials to pull images from a private registry.

Create an imagePullSecret from a Service Account key, which does not expire on its own:

gcloud iam service-accounts keys create /tmp/ar-pull-key.json \
  --iam-account=YOUR_SA@YOUR_PROJECT.iam.gserviceaccount.com

kubectl create secret docker-registry ar-secret \
  --docker-server=us-east1-docker.pkg.dev \
  --docker-username=_json_key \
  --docker-password="$(cat /tmp/ar-pull-key.json)" \
  --namespace=default

rm /tmp/ar-pull-key.json

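Delete the SA key file as soon as the secret is created: the K8s Secret already holds its full contents, so keeping a local copy is an unnecessary risk.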

Adding imagePullSecrets to deployment.yaml

# Pod spec fragment (spec.template.spec of the Deployment)
spec:
  imagePullSecrets:
    - name: ar-secret
  containers:
    - name: go-api
      image: us-east1-docker.pkg.dev/YOUR_PROJECT/go-api/go-api:prod-7639a24

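On GKE this is not needed: nodes carry a GCP identity (the node's service account) that can be granted pull access to Artifact Registry. ar-secret is a limitation of the local k3s environment.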

The Complete GitOps Flow

sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub Actions
    participant AR as Artifact Registry
    participant Git as Git Repo
    participant Argo as ArgoCD
    participant K8s as k3s Cluster

    Dev->>GH: push to main
    GH->>GH: go test + go build
    GH->>AR: docker push prod-{sha}
    GH->>Git: update deployment.yaml image tag
    GH->>Git: push gitops/update-image-{sha} branch
    Dev->>Git: merge PR to main
    loop every 3 minutes
        Argo->>Git: poll for changes
        Git-->>Argo: deployment.yaml changed
    end
    Argo->>K8s: kubectl apply k8s/base/
    K8s->>K8s: rolling update

Why CI cannot push directly to main: main has a branch protection rule that requires a PR. CI works around this by pushing to a gitops branch, which also keeps the code review step.
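
The "update deployment.yaml image tag" and "push gitops branch" steps from the diagram could look roughly like this in the CI job. This is a sketch under assumptions, not the actual workflow: the function names, the commit message, and the use of sed are illustrative, while the branch name pattern and manifest path come from the flow above.

```shell
# Hypothetical helper: rewrite the tag on the image line of a manifest.
# Arguments: <manifest path> <image repository without tag> <new tag>
update_image_tag() {
  manifest="$1"
  image_repo="$2"
  new_tag="$3"
  tmp="$(mktemp)"
  # Replace whatever follows "<image_repo>:" on the image line with the new tag.
  sed "s|\(image: *${image_repo}\):.*|\1:${new_tag}|" "$manifest" > "$tmp"
  # Write back in place so the manifest keeps its original file permissions.
  cat "$tmp" > "$manifest"
  rm -f "$tmp"
}

# Hypothetical helper: commit the change on a gitops branch and push it.
# Argument: <short commit sha>
push_gitops_branch() {
  sha="$1"
  branch="gitops/update-image-${sha}"
  git checkout -b "$branch"
  git add k8s/base/deployment.yaml
  git commit -m "chore: deploy prod-${sha}"
  git push origin "$branch"
}
```

In a workflow step this would be called as `update_image_tag k8s/base/deployment.yaml <image repo> "prod-${sha}"` followed by `push_gitops_branch "${sha}"`. Opening the PR itself is left to whatever mechanism the repository already uses.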

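Drift detection: if someone runs kubectl edit deployment go-api by hand and changes replicas, ArgoCD's selfHeal automatically reverts the value to what git says on the controller's next reconciliation. The cluster's state is always decided by git.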
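
One way to watch selfHeal at work is to create drift on purpose and read the value back. A minimal sketch, assuming deployment.yaml pins spec.replicas (if the HPA in base/ owns the replica count, the result will differ) and that the Deployment is go-api in the default namespace. The function name, the replica count of 5, and the 30-second wait are arbitrary choices for illustration.

```shell
# Hypothetical demo: change the live Deployment by hand, wait, then read it back.
drift_demo() {
  # Introduce drift: modify the live object without touching git.
  kubectl -n default scale deployment go-api --replicas=5

  # Give the controller time to notice and re-sync (override via DRIFT_WAIT_SECONDS).
  sleep "${DRIFT_WAIT_SECONDS:-30}"

  # With selfHeal enabled, this should show the replica count from git again.
  kubectl -n default get deployment go-api -o jsonpath='{.spec.replicas}{"\n"}'
}
```

If the printed value still shows 5, the ArgoCD UI's sync status for go-api is the first place to look.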

GKE vs k3s

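| | k3s (local) | GKE |
|---|---|---|
| imagePullSecret | Must be created manually | Not needed; nodes pull with their own GCP identity |
| Load Balancer | None; requires NodePort | Native L4/L7 LB support |
| Workload Identity | Not supported | Natively supported |
| Cost | Free | e2-small, ~$38 per 80 days |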

k3s suits a local learning environment; GKE is closer to production. The next phase sets up Prometheus + Grafana on GKE and, along the way, removes the need for ar-secret.

References