This post covers two topics: manually creating a VPC, a subnet, firewall rules, and a service account on GCP, and API Gateway design concepts along with common interview questions.

## GCP Networking

### VPC: Global vs Regional

A GCP VPC is **global**, whereas an Azure VNet is **regional**.

```
GCP:
  devops-vpc (global)
  ├── subnet: us-east1 (10.0.1.0/24)
  └── subnet: asia-east1 (10.0.2.0/24)   # same VPC, different region
  # VMs in different regions communicate over internal network directly

Azure:
  vnet-eastus (East US)                   # region-scoped
  vnet-westus (West US)                   # separate VNet
  # cross-region requires VNet Peering
```

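The advantage on GCP is that VMs in different regions of the same VPC can reach each other over internal IPs without peering (firewall rules still have to allow the traffic).

### Subnets and CIDR

A subnet is regional. Creating one requires a region and an IP range (CIDR).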

```
10.0.1.0/24

10  .  0  .  1  .  0
00001010.00000000.00000001.00000000
|______fixed 24 bits_______| vary |
```

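`/24` leaves the last 8 bits variable: 2^8 = 256 addresses. Conventionally 254 of them are usable after excluding the network and broadcast addresses. GCP reserves four addresses in each subnet's primary range (network, default gateway, second-to-last, and broadcast), which leaves 252 for resources.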

| CIDR | Usable IPs | Use case |
|------|-----------|---------|
| /8   | 16,777,214 | Large private network |
| /16  | 65,534     | Medium VPC |
| /24  | 254        | Typical subnet |
| /28  | 14         | Small subnet |
| /29  | 6          | Smallest subnet GCP allows |
| /32  | 1          | Single IP (firewall rule) |

`/0` matches every IP address, so `0.0.0.0/0` means "any source".
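The CIDR arithmetic above can be checked with Python's standard `ipaddress` module. This is a minimal sketch: the helper name `subnet_summary` and its `reserved` parameter are made up for illustration. `reserved=2` models the conventional network and broadcast addresses, while `reserved=4` models the four addresses GCP reserves in a primary subnet range.

```python
import ipaddress


def subnet_summary(cidr: str, reserved: int = 2) -> dict:
    """Return basic facts about an IPv4 CIDR block.

    `reserved` is the number of addresses assumed to be unavailable
    to hosts (a hypothetical knob for this sketch).
    """
    net = ipaddress.ip_network(cidr)
    total = net.num_addresses  # 2 ** (32 - prefix length) for IPv4
    return {
        "prefix": net.prefixlen,
        "first": str(net.network_address),
        "last": str(net.broadcast_address),
        "total": total,
        "usable": max(total - reserved, 0),
    }


if __name__ == "__main__":
    for cidr in ("10.0.0.0/8", "10.0.0.0/16", "10.0.1.0/24", "10.0.1.0/28"):
        print(cidr, subnet_summary(cidr))

    # Membership test: does an address fall inside the subnet?
    subnet = ipaddress.ip_network("10.0.1.0/24")
    print(ipaddress.ip_address("10.0.1.77") in subnet)
```

The same module also answers the question a firewall rule asks: whether a source address falls inside a given range.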

```bash
# create VPC (custom mode: you control subnet creation)
gcloud compute networks create devops-vpc --subnet-mode=custom

# create subnet in us-east1
gcloud compute networks subnets create devops-subnet \
  --network=devops-vpc \
  --region=us-east1 \
  --range=10.0.1.0/24
```

### Firewall Rules

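GCP firewall rules are defined at the VPC network level, not the subnet level (Azure NSGs, by contrast, attach to a subnet or a NIC).

A lower priority number means higher precedence. The range is 0 (highest) to 65535 (lowest), and the default is 1000. Every VPC also has two implied rules at priority 65535 that do not appear in the rule list: an implied deny for ingress and an implied allow for egress. Ingress traffic that matches no rule is therefore blocked, while egress traffic is allowed unless you add a deny rule.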

```
Priority 500: deny tcp:22 from 1.2.3.4   # matched first -> blocked
Priority 1000: allow tcp:22 from 0.0.0.0/0
# result: 1.2.3.4 is blocked, all other IPs allowed
```
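The evaluation order can be sketched in a few lines of Python. This is a simplified model, not GCP's implementation: the `Rule` class and `evaluate_ingress` function are hypothetical names, each rule matches a single TCP port and one source CIDR, and only ingress is modeled. Rules are tried from the lowest priority number upward, a deny wins over an allow at the same priority, and traffic that matches nothing falls through to the implied deny.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    priority: int   # lower number = evaluated first
    action: str     # "allow" or "deny"
    port: int       # single TCP port (simplification)
    source: str     # source range in CIDR notation


def evaluate_ingress(rules: List[Rule], src_ip: str, port: int) -> str:
    """Return "allow" or "deny" for one incoming connection."""
    addr = ipaddress.ip_address(src_ip)
    # Sort by priority; at equal priority, deny rules come first.
    ordered = sorted(rules, key=lambda r: (r.priority, r.action != "deny"))
    for rule in ordered:
        if rule.port == port and addr in ipaddress.ip_network(rule.source):
            return rule.action  # first match decides
    return "deny"  # implied deny for ingress


RULES = [
    Rule(priority=1000, action="allow", port=22, source="0.0.0.0/0"),
    Rule(priority=500, action="deny", port=22, source="1.2.3.4/32"),
]

if __name__ == "__main__":
    print(evaluate_ingress(RULES, "1.2.3.4", 22))       # blocked by priority 500
    print(evaluate_ingress(RULES, "203.0.113.9", 22))   # allowed by priority 1000
    print(evaluate_ingress(RULES, "203.0.113.9", 443))  # no match, implied deny
```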

```bash
# allow SSH from anywhere (dev only -- use your own IP in production)
gcloud compute firewall-rules create allow-ssh \
  --network=devops-vpc \
  --allow=tcp:22 \
  --source-ranges=0.0.0.0/0

# allow all internal traffic within the subnet
gcloud compute firewall-rules create allow-internal \
  --network=devops-vpc \
  --allow=tcp,udp,icmp \
  --source-ranges=10.0.1.0/24

# verify source ranges
gcloud compute firewall-rules describe allow-ssh --format="get(sourceRanges)"
```

In production, use `--source-ranges=YOUR_IP/32` so that only your own IP is allowed.

### Service Account vs Managed Identity

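A service account is an identity for programs and services, not for people.

| | GCP Service Account | Azure Managed Identity |
|--|--------------------|-----------------------|
| Creation | Created and assigned manually | System-assigned: created automatically with the resource |
| Credentials | JSON key can be downloaded (not recommended) | No key at all; Azure manages the token lifecycle |
| Lifecycle | Exists independently, managed manually | System-assigned: deleted together with the resource |

Azure closes off the key path by design, which is the safer default. GCP makes downloading an SA key so easy that leaked keys are a common source of security incidents.

The fix is **Workload Identity Federation (WIF)**: GitHub Actions or a GKE Pod exchanges an OIDC token for GCP access, and no SA key appears anywhere in the flow.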

```
GitHub Actions
  -> generate OIDC token
    -> WIF validates token
      -> exchange for SA permissions
        -> access GCP resources (no key involved)
```
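The "WIF validates token" step is essentially a trust decision over the token's claims. The sketch below models only that decision and assumes the token's signature has already been verified; the function name `is_trusted` and the repository names are hypothetical. A real setup expresses the same check as provider configuration (attribute mapping and conditions) rather than application code.

```python
from typing import Dict, Set

# Issuer URL that GitHub Actions puts in the `iss` claim of its OIDC tokens.
GITHUB_ISSUER = "https://token.actions.githubusercontent.com"


def is_trusted(claims: Dict[str, str], allowed_repos: Set[str]) -> bool:
    """Decide whether already-verified OIDC claims may be exchanged for access.

    Simplification: signature, audience, and expiry checks are assumed
    to have happened before this function is called.
    """
    return (
        claims.get("iss") == GITHUB_ISSUER
        and claims.get("repository") in allowed_repos
    )


if __name__ == "__main__":
    claims = {
        "iss": GITHUB_ISSUER,
        "repository": "example-org/example-repo",  # hypothetical repository
        "sub": "repo:example-org/example-repo:ref:refs/heads/main",
    }
    print(is_trusted(claims, {"example-org/example-repo"}))
    print(is_trusted(claims, {"example-org/other-repo"}))
```

Restricting on a claim such as `repository` is what stops a workflow in someone else's repository from using the same identity pool.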

SA + WIF is as secure as a Managed Identity, but you have to set it up yourself.

```bash
# create service account
gcloud iam service-accounts create devops-cicd-sa \
  --display-name="DevOps CI/CD SA" \
  --project=devops-lab-lou-2026

# verify
gcloud iam service-accounts list --project=devops-lab-lou-2026
```

## API Gateway

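### Definition

An API Gateway is the single entry point for all requests. It handles cross-cutting concerns so that backend services can focus on business logic.

| Handled at the Gateway | What backend services no longer handle |
|----------------|----------------|
| TLS termination | No HTTPS handling |
| Auth (JWT / API key) | No per-service token validation |
| Rate limiting | No per-service throttling of abusive traffic |
| Caching | No per-service cache |
| Routing | Path → service mapping managed in one place |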

### SSL Termination

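The TLS handshake costs CPU. Let the Gateway terminate TLS (still commonly called SSL termination), and have the Gateway talk to the backends over plain HTTP on the internal network.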

```
Client  --HTTPS-->  API Gateway  --HTTP-->  Backend Pod A
                    (decrypt here)          Backend Pod B
```

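Benefits:
- The certificate is managed in one place, the Gateway, so a renewal touches a single location
- Backend services avoid the CPU cost of encryption and decryption

When the Gateway and the backends share a VPC, plain HTTP internally is usually acceptable. Stricter environments (finance, healthcare) can add mTLS so that traffic between backend services is encrypted as well.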

### Caching

APIs that are read-heavy and whose responses rarely change are good candidates for caching at the Gateway.

```
Without cache:
  User A -> GET /products -> Gateway -> Backend -> DB
  User B -> GET /products -> Gateway -> Backend -> DB
  # 100,000 requests = 100,000 DB queries

With cache:
  User A -> GET /products -> Gateway -> Backend -> DB  (first request, store result)
  User B -> GET /products -> Gateway -> return cached  (no backend hit)
  # 100,000 requests within the cache TTL = 1 DB query
```

Personalized responses (`GET /cart`, `GET /orders`) are poor candidates for a shared cache because each user's data is different.
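The effect of a Gateway-level cache can be illustrated with a small time-to-live cache. This is a sketch under assumptions: `TTLCache`, `get_or_fetch`, and the 60-second TTL are illustrative choices, the cache lives in one process, and the cache key is just the method and path (which is exactly why per-user responses would need a different key or no caching).

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process cache where entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._store: Dict[str, Tuple[float, object]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], object]) -> object:
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]               # fresh entry: no backend call
        value = fetch()                 # miss or expired: call the backend
        self._store[key] = (now + self.ttl, value)
        return value


def demo(requests: int = 100_000) -> int:
    """Return how many times the backend was hit for `requests` calls."""
    backend_calls = 0

    def load_products():
        nonlocal backend_calls
        backend_calls += 1
        return ["widget", "gadget"]     # made-up payload

    cache = TTLCache(ttl_seconds=60)
    for _ in range(requests):
        cache.get_or_fetch("GET /products", load_products)
    return backend_calls


if __name__ == "__main__":
    print(demo())
```

The TTL is the trade-off knob: a longer TTL means fewer backend hits but staler data.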

### BFF Pattern

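The more client needs diverge, the more compromises a single Gateway serving every client has to make. The BFF (Backend for Frontend) pattern addresses this by maintaining a separate Gateway for each type of client.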

```
Web BFF    --+
Mobile BFF --+--> User Service, Order Service, Product Service
Public BFF --+    # same backends, different gateway per client
```

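The Web BFF can do heavy data aggregation, the Mobile BFF returns only a trimmed set of fields, and the Public BFF enforces strict versioning. The backend microservices stay the same.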
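A minimal sketch of that idea: two BFF handlers call the same backend functions and differ only in how they shape the response. All names and data here are invented for illustration; in practice `get_user` and `get_orders` would be network calls to the User and Order services.

```python
from typing import Dict, List


# Stand-ins for backend microservices (made-up data).
def get_user(user_id: int) -> Dict:
    return {"id": user_id, "name": "Ada", "email": "ada@example.com", "tier": "gold"}


def get_orders(user_id: int) -> List[Dict]:
    return [
        {"id": 1, "total": 42.0, "items": ["widget"], "status": "shipped"},
        {"id": 2, "total": 8.5, "items": ["gadget"], "status": "pending"},
    ]


def web_bff_dashboard(user_id: int) -> Dict:
    """Web client: aggregate everything into one rich payload."""
    user = get_user(user_id)
    orders = get_orders(user_id)
    return {
        "user": user,
        "orders": orders,
        "lifetime_spend": sum(o["total"] for o in orders),
    }


def mobile_bff_dashboard(user_id: int) -> Dict:
    """Mobile client: same backends, only the fields the app renders."""
    user = get_user(user_id)
    orders = get_orders(user_id)
    return {
        "name": user["name"],
        "orders": [{"id": o["id"], "status": o["status"]} for o in orders],
    }


if __name__ == "__main__":
    print(web_bff_dashboard(7))
    print(mobile_bff_dashboard(7))
```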

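Good fit: several clearly different clients, each with a team able to maintain its own BFF. For a small team, or when the clients differ little, one unified Gateway is enough.

GraphQL is another option: the client chooses which fields it needs, and one endpoint serves every client. The cost is schema design complexity and the N+1 query problem.

### Architecture and High Availability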

```
Internet
  |
L7 LB          # distributes traffic across Gateway instances, health checks
  |
API Gateway x N instances (stateless)
  |- Auth (JWT / API key validation)
  |- Rate limiting (counters in Redis, shared across instances)
  |- Routing (/users -> User Service, /orders -> Order Service)
  |- SSL Termination
  |
L4 LB          # distributes traffic to backend pods
  |
Backend microservices
```

The Gateway is **stateless**: routing rules live in a config file or a DB, any instance can handle any request, and horizontal scaling follows naturally.
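Routing shows why statelessness is easy here: the decision is a pure function of the request path and a read-only route table. The sketch below uses longest-prefix matching; the `resolve` function, the service names, and the `/orders/reports` route are hypothetical additions for illustration.

```python
from typing import Dict, Optional

# Read-only route table, e.g. loaded from a config file at startup.
ROUTES: Dict[str, str] = {
    "/users": "user-service",
    "/orders": "order-service",
    "/orders/reports": "reporting-service",  # hypothetical, shows longest match
}


def resolve(path: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the service for `path`, preferring the longest matching prefix."""
    best: Optional[str] = None
    for prefix in routes:
        # Match whole path segments only, so "/usersx" does not hit "/users".
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return routes[best] if best is not None else None


if __name__ == "__main__":
    for p in ("/users/42", "/orders/7", "/orders/reports/2024", "/unknown"):
        print(p, "->", resolve(p))
```

Because nothing is written during a lookup, every instance holding the same table returns the same answer.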

Rate limiting is stateful: the counters live in Redis, and all instances share the same data.

```
API Gateway instance A --|
API Gateway instance B --|---> Redis (rate limit counters)
API Gateway instance C --|
```

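Without a shared Redis, a user could send 100 requests to each of the three instances and get 300 requests through, bypassing a limit of 100 requests per minute.

### Common Interview Questions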
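The bypass is easy to reproduce with a fixed-window counter. In this sketch `CounterStore` is an in-memory stand-in for Redis, and `GatewayInstance` and `count_allowed` are hypothetical names; with a real Redis the same idea is typically an atomic increment on a per-window key that expires after the window.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class CounterStore:
    """In-memory stand-in for a shared counter store such as Redis."""

    def __init__(self) -> None:
        self._counts: DefaultDict[Tuple[str, int], int] = defaultdict(int)

    def incr(self, key: Tuple[str, int]) -> int:
        self._counts[key] += 1
        return self._counts[key]


class GatewayInstance:
    def __init__(self, store: CounterStore, limit: int = 100,
                 window_seconds: int = 60) -> None:
        self.store = store
        self.limit = limit
        self.window_seconds = window_seconds

    def allow(self, user: str, now: float) -> bool:
        # Fixed window: all requests in the same minute share one counter.
        window = int(now // self.window_seconds)
        return self.store.incr((user, window)) <= self.limit


def count_allowed(instances: List[GatewayInstance], per_instance: int = 100) -> int:
    """Send `per_instance` requests to every instance; count the accepted ones."""
    return sum(
        inst.allow("alice", now=0.0)
        for inst in instances
        for _ in range(per_instance)
    )


if __name__ == "__main__":
    shared = CounterStore()
    with_shared = [GatewayInstance(shared) for _ in range(3)]
    isolated = [GatewayInstance(CounterStore()) for _ in range(3)]
    print("shared store:  ", count_allowed(with_shared))
    print("isolated stores:", count_allowed(isolated))
```

With one shared store the three instances admit 100 requests in total; with isolated stores each admits its own 100.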

**Q: Is the Gateway a SPOF?**

> Not if it is deployed as multiple instances behind a Load Balancer. The Gateway is stateless, so any instance can handle any request and horizontal scaling is straightforward.

**Q: How do you share rate limit state across instances?**

> Store rate limit counters in Redis. All Gateway instances read and write to the same Redis, so the limit is enforced globally regardless of which instance handles the request.

**Q: What's the difference between a Gateway and a Load Balancer?**

> An LB distributes traffic at L4/L7. A Gateway handles application-level concerns: auth, rate limiting, routing logic. The LB sits in front of the Gateway for HA; the Gateway sits in front of your services so they can focus on business logic.

### Consistent Hashing

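How can multiple LB instances make routing decisions without sharing state?

Round Robin needs a counter ("which backend did I pick last?"). When each instance keeps its own counter, the combined distribution can end up uneven.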

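Hash-based routing uses a deterministic function: the same input always yields the same output, with no dependence on external state. Plain modulo hashing already has that property, but changing `number_of_backends` remaps most clients. Consistent Hashing places backends on a hash ring instead, so adding or removing a backend moves only roughly 1/N of the clients.

```
# plain modulo hashing: deterministic, but not yet consistent hashing
hash(client IP) % number_of_backends = which backend to route to

IP ending in 0-3 -> Backend A
IP ending in 4-6 -> Backend B
IP ending in 7-9 -> Backend C

# consistent hashing: backends and clients are hashed onto the same ring,
# and each client goes to the first backend clockwise from its position
```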

Any LB instance that receives the same IP computes the same result. No communication and no shared state are needed.

| Algorithm | Needs shared state | Reason |
|-----------|-------------------|--------|
| Round Robin | Yes | requires counter |
| Least Connections | Yes | requires connection count per backend |
| Random | No | stateless by nature |
| Consistent Hashing | No | deterministic function, same input = same output |
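A hash ring can be sketched with the standard library alone. This is one common construction, not a specific load balancer's implementation: `HashRing`, the use of MD5 as the hash, and 100 virtual nodes per backend are all illustrative choices. The demo compares how many clients change backend when Backend C is removed, under the ring and under plain modulo.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


def _hash(value: str) -> int:
    """Stable hash (Python's built-in hash() is randomized per process)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, backends: Iterable[str], vnodes: int = 100) -> None:
        # Each backend gets `vnodes` points on the ring to even out the load.
        self._ring: Dict[int, str] = {}
        for backend in backends:
            for i in range(vnodes):
                self._ring[_hash(f"{backend}#{i}")] = backend
        self._points: List[int] = sorted(self._ring)

    def route(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[self._points[idx]]


def modulo_route(key: str, backends: List[str]) -> str:
    return backends[_hash(key) % len(backends)]


KEYS = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]

if __name__ == "__main__":
    before, after = HashRing(["A", "B", "C"]), HashRing(["A", "B"])
    ring_moved = sum(before.route(k) != after.route(k) for k in KEYS)
    mod_moved = sum(
        modulo_route(k, ["A", "B", "C"]) != modulo_route(k, ["A", "B"])
        for k in KEYS
    )
    print(f"ring:   {ring_moved} of {len(KEYS)} clients moved")
    print(f"modulo: {mod_moved} of {len(KEYS)} clients moved")
```

On the ring, the only clients that move are those that were on the removed backend; everyone else keeps the same backend, which is what preserves cache locality and session affinity during scaling events.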

## References

- [GCP VPC overview](https://cloud.google.com/vpc/docs/overview)
- [GCP Firewall Rules](https://cloud.google.com/firewall/docs/firewalls)
- [GCP Service Accounts](https://cloud.google.com/iam/docs/service-account-overview)
- [Workload Identity Federation](https://cloud.google.com/iam/docs/workload-identity-federation)
- [API Gateway concepts - Google Cloud](https://cloud.google.com/api-gateway/docs/about-api-gateway)
- [Consistent Hashing - Tom White](https://tom-e-white.com/2007/11/consistent-hashing.html)
- [BFF Pattern - Sam Newman](https://samnewman.io/patterns/architectural/bff/)
- [Rate Limiting with Redis](https://redis.io/glossary/rate-limiting/)