GCP Networking Fundamentals and API Gateway Design

Lou Chang included in DevOps GCP SystemDesign

2026-04-11 About 1500 words 7 minutes

This post covers two topics: manually creating a VPC, Subnet, Firewall Rules, and a Service Account on GCP; and API Gateway design concepts, along with common interview questions.

GCP Networking

VPC: Global vs Regional

A GCP VPC is a global resource; an Azure VNet is scoped to a single region.

GCP:
  devops-vpc (global)
  ├── subnet: us-east1 (10.0.1.0/24)
  └── subnet: asia-east1 (10.0.2.0/24)   # same VPC, different region
  # VMs in different regions communicate over internal network directly

Azure:
  vnet-eastus (East US)                   # region-scoped
  vnet-westus (West US)                   # separate VNet
  # cross-region requires VNet Peering

GCP’s advantage is that VMs in different regions of the same VPC can reach each other directly over the internal network, without Peering, as long as firewall rules allow the traffic.

VPC Subdivision and Address Ranges

Subnets are regional. When creating a Subnet, you specify the region and IP range (CIDR).

10.0.1.0/24

10  .  0  .  1  .  0
00001010.00000000.00000001.00000000
|______fixed 24 bits_______| vary |

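/24 means the first 24 bits are fixed and the last 8 bits are variable: 2^8 = 256 addresses, 254 usable by the classic convention (network and broadcast addresses excluded). GCP reserves four addresses in each subnet’s primary range (network, default gateway, second-to-last, and broadcast), so a /24 subnet on GCP has 252 usable addresses.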

CIDR	Usable IPs	Use case
/8	16,777,214	Large private network
/16	65,534	Medium VPC
/24	254	Typical subnet
/28	14	Small subnet
/29	6	Smallest subnet size GCP allows
/32	1	Single IP (firewall rule)

A /0 prefix fixes no bits, so it matches every address. As a firewall source range, 0.0.0.0/0 means “any source”.
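The CIDR arithmetic above can be checked with Python’s standard ipaddress module. This is a minimal sketch; the helper names describe and matches are made up for illustration, and the "classic usable" count follows the table’s convention rather than GCP’s four reserved addresses.

```python
import ipaddress

def describe(cidr: str) -> dict:
    """Summarise an IPv4 CIDR block: prefix length and address counts."""
    net = ipaddress.ip_network(cidr)
    total = net.num_addresses  # 2 ** (32 - prefix length)
    # Classic convention: subtract the network and broadcast addresses.
    # A /31 or /32 has no room for them, so count every address instead.
    usable = total - 2 if net.prefixlen <= 30 else total
    return {"prefix": net.prefixlen, "total": total, "classic_usable": usable}

def matches(ip: str, cidr: str) -> bool:
    """True when the address falls inside the CIDR block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)

print(describe("10.0.1.0/24"))               # total 256, classic_usable 254
print(matches("10.0.1.57", "10.0.1.0/24"))   # True
print(matches("10.0.2.5", "10.0.1.0/24"))    # False: belongs to another subnet
print(matches("203.0.113.9", "0.0.0.0/0"))   # True: /0 matches any source
```

The same matches check is what a firewall source range does conceptually: a /32 matches exactly one address, a /0 matches all of them.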

# create VPC (custom mode: you control subnet creation)
gcloud compute networks create devops-vpc --subnet-mode=custom

# create subnet in us-east1
gcloud compute networks subnets create devops-subnet \
  --network=devops-vpc \
  --region=us-east1 \
  --range=10.0.1.0/24

Firewall Rules

GCP Firewall Rules are defined at the VPC network level, not the Subnet level, and are scoped to specific VMs with target tags or service accounts. An Azure NSG, by contrast, is attached to a Subnet or a NIC.

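Lower priority numbers mean higher precedence. The range is 0 (highest) to 65535 (lowest), and the default is 1000. Every VPC network also has two implied rules at priority 65535: deny all ingress and allow all egress. Inbound traffic that matches no rule is therefore blocked, while outbound traffic is allowed unless a rule denies it.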

Priority 500: deny tcp:22 from 1.2.3.4   # matched first -> blocked
Priority 1000: allow tcp:22 from 0.0.0.0/0
# result: 1.2.3.4 is blocked, all other IPs allowed
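The priority example can be reproduced with a small simulation. This is a sketch of the evaluation order only, not GCP’s implementation: the Rule class and evaluate function are hypothetical, rules match on a single TCP port, and the fallback models the implied deny for ingress traffic.

```python
import ipaddress
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    priority: int   # lower number = evaluated first
    action: str     # "allow" or "deny"
    port: int       # single TCP port, for simplicity
    source: str     # source range in CIDR notation

def evaluate(rules, src_ip: str, port: int) -> str:
    """Return the action of the first matching rule in priority order."""
    ip = ipaddress.ip_address(src_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port == port and ip in ipaddress.ip_network(rule.source):
            return rule.action
    return "deny"  # nothing matched: implied deny for ingress

rules = [
    Rule(priority=1000, action="allow", port=22, source="0.0.0.0/0"),
    Rule(priority=500, action="deny", port=22, source="1.2.3.4/32"),
]

print(evaluate(rules, "1.2.3.4", 22))   # deny: the priority 500 rule matches first
print(evaluate(rules, "5.6.7.8", 22))   # allow: only the priority 1000 rule matches
print(evaluate(rules, "5.6.7.8", 80))   # deny: no rule covers port 80
```

The order in which the rules are listed does not matter; only the priority number decides which one wins.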

# allow SSH from anywhere (dev only -- use your own IP in production)
gcloud compute firewall-rules create allow-ssh \
  --network=devops-vpc \
  --allow=tcp:22 \
  --source-ranges=0.0.0.0/0

# allow all internal traffic within the subnet
gcloud compute firewall-rules create allow-internal \
  --network=devops-vpc \
  --allow=tcp,udp,icmp \
  --source-ranges=10.0.1.0/24

# verify source ranges
gcloud compute firewall-rules describe allow-ssh --format="get(sourceRanges)"

In production, set --source-ranges=YOUR_IP/32 so that only your own IP is allowed.

Service Account vs Managed Identity

A Service Account is an identity for programs or services, not for people.

Aspect	GCP Service Account	Azure Managed Identity
Creation	Manually created, manually assigned	System-assigned auto-created with resource
Credentials	JSON key downloadable (but not recommended)	No key at all, Azure manages token lifecycle
Lifecycle	Exists independently, manually managed	System-assigned deleted when resource is deleted

Azure’s design removes the key path entirely, which makes it the more secure default. On GCP, downloading an SA key is easy, and leaked long-lived keys are a common cause of security incidents.

The recommended replacement is Workload Identity Federation (WIF): let GitHub Actions or GKE Pods exchange their OIDC tokens for short-lived GCP credentials. The entire flow requires no SA key.

GitHub Actions
  -> generate OIDC token
    -> WIF validates token
      -> exchange for SA permissions
        -> access GCP resources (no key involved)

SA + WIF gives security comparable to Managed Identity, but you have to set it up yourself.
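To make the "WIF validates token" step concrete, here is a rough simulation of the claim checks, assuming a provider configured for GitHub Actions with a condition that restricts access to one repository. It is not the real WIF API: in practice these checks are provider settings on the GCP side, not application code. The function name, audience value, and repository name are hypothetical, and signature verification is left out.

```python
import time

# GitHub's OIDC issuer URL for Actions tokens.
GITHUB_ISSUER = "https://token.actions.githubusercontent.com"

def claims_acceptable(claims, expected_audience, allowed_repository, now=None):
    """Simulate the claim checks applied before credentials are issued.

    Signature verification against the issuer's public keys is omitted.
    """
    now = time.time() if now is None else now
    return (
        claims.get("iss") == GITHUB_ISSUER
        and claims.get("aud") == expected_audience
        and claims.get("repository") == allowed_repository
        and claims.get("exp", 0) > now
    )

# Hypothetical decoded token claims; exp and now are plain numbers for the demo.
token_claims = {
    "iss": GITHUB_ISSUER,
    "aud": "hypothetical-wif-audience",
    "repository": "example-org/example-repo",
    "exp": 2_000,
}

print(claims_acceptable(token_claims, "hypothetical-wif-audience",
                        "example-org/example-repo", now=1_000))  # True
print(claims_acceptable(token_claims, "hypothetical-wif-audience",
                        "other-org/other-repo", now=1_000))      # False
```

Only a token that passes every check is exchanged for short-lived credentials, which is why no long-lived key has to exist anywhere in the pipeline.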

# create service account
gcloud iam service-accounts create devops-cicd-sa \
  --display-name="DevOps CI/CD SA" \
  --project=devops-lab-lou-2026

# verify
gcloud iam service-accounts list --project=devops-lab-lou-2026

API Gateway Design

Unified Entry Point

An API Gateway is the single entry point for all requests. It handles cross-cutting concerns so that backend services can focus on business logic.

Handled at Gateway	What backend services no longer need to do
TLS termination	No need to handle HTTPS
Auth (JWT / API key)	No need for each service to validate tokens separately
Rate limiting	No need to block brute-force requests themselves
Caching	No need for each service to implement its own cache
Routing	Unified management of path → service mapping
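The routing row is the easiest one to sketch in code. The example below assumes a longest-prefix match on the request path, which is one common way to do it; the ROUTES table, the resolve function, and the service names are illustrative only.

```python
# Hypothetical path-prefix -> backend service mapping held by the Gateway.
ROUTES = {
    "/users": "user-service",
    "/orders": "order-service",
    "/products": "product-service",
}

def resolve(path: str, routes=ROUTES):
    """Return the backend for the longest matching path prefix, or None."""
    # Match whole path segments only, so "/usersx" does not match "/users".
    candidates = [p for p in routes if path == p or path.startswith(p + "/")]
    if not candidates:
        return None
    return routes[max(candidates, key=len)]

print(resolve("/users/42"))   # user-service
print(resolve("/orders"))     # order-service
print(resolve("/usersx"))     # None
print(resolve("/health"))     # None
```

Because the mapping lives in one place, adding or moving a service is a Gateway config change, not a change in every client.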

TLS Offloading at the Edge

TLS handshakes and encryption cost CPU. Let the Gateway terminate TLS, and let the Gateway and backends communicate over plain HTTP on the internal network.

Client  --HTTPS-->  API Gateway  --HTTP-->  Backend Pod A
                    (decrypt here)          Backend Pod B

Benefits:

Certificates only need to be managed at the Gateway, renewed in one place
Backend services save the CPU cost of encryption/decryption

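The Gateway and backends sit in the same VPC, so plain HTTP on the internal network is usually acceptable. For stricter requirements (finance, healthcare), add mTLS so that traffic between the Gateway and the backends, and between the backends themselves, is also encrypted and mutually authenticated.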

Response Caching at the Gateway

APIs with high read-to-write ratios and infrequently changing responses are suitable for caching at the Gateway layer.

Without cache:
  User A -> GET /products -> Gateway -> Backend -> DB
  User B -> GET /products -> Gateway -> Backend -> DB
  # 100,000 requests = 100,000 DB queries

With cache:
  User A -> GET /products -> Gateway -> Backend -> DB  (first request, store result)
  User B -> GET /products -> Gateway -> return cached  (no backend hit)
  # 100,000 requests = 1 DB query (until the cached entry expires)

Personalized responses (GET /cart, GET /orders) are not suitable for caching because each user’s data is different.
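A minimal in-memory sketch of the idea, assuming a simple TTL per path. The GatewayCache class and fetch_products function are hypothetical stand-ins; a real Gateway would normally rely on its built-in cache or an external store instead.

```python
import time

class GatewayCache:
    """Tiny TTL cache keyed by request path."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # path -> (expires_at, response)

    def get(self, path, fetch):
        now = self.clock()
        entry = self.store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]               # hit: the backend is not called
        response = fetch(path)            # miss or expired: call the backend
        self.store[path] = (now + self.ttl, response)
        return response

calls = {"count": 0}  # how many times the "backend" was actually hit

def fetch_products(path):
    """Stand-in for Gateway -> Backend -> DB."""
    calls["count"] += 1
    return {"path": path, "items": ["keyboard", "mouse"]}

cache = GatewayCache(ttl_seconds=60)
for _ in range(100_000):
    cache.get("/products", fetch_products)

print(calls["count"])  # 1: only the first request reached the backend
```

The TTL is the tradeoff knob: a longer TTL means fewer backend hits but staler data. For a personalized endpoint such as GET /cart, the path alone is not a safe cache key.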

Dedicated Frontend Gateways

The more the API needs of different clients diverge, the more compromises a single Gateway for all clients has to make. The BFF (Backend for Frontend) pattern addresses this by maintaining a separate Gateway for each client type.

Web BFF    --|
Mobile BFF --|---> User / Order / Product Services   (same backends, different gateways)
Public BFF --|

The Web BFF can do heavy data aggregation, the Mobile BFF returns only compact fields, and the Public BFF enforces strict versioning. The backend microservices remain unchanged.
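A small sketch of two BFFs shaping the same backend data differently. Everything here is made up for illustration: the fetch functions stand in for calls to backend services, and the reviews lookup is a hypothetical second service added only to show aggregation.

```python
def fetch_product(product_id):
    """Stand-in for a call to the (unchanged) Product Service."""
    return {
        "id": product_id,
        "name": "Keyboard",
        "price_cents": 4999,
        "description": "Mechanical keyboard",
        "stock": 12,
    }

def fetch_reviews(product_id):
    """Stand-in for a hypothetical reviews backend."""
    return [{"rating": 5}, {"rating": 4}]

def web_bff_product(product_id):
    """Web BFF: aggregate several backend calls into one rich response."""
    product = fetch_product(product_id)
    reviews = fetch_reviews(product_id)
    average = sum(r["rating"] for r in reviews) / len(reviews)
    return {**product, "review_count": len(reviews), "average_rating": average}

def mobile_bff_product(product_id):
    """Mobile BFF: return only the compact fields the app renders."""
    product = fetch_product(product_id)
    return {key: product[key] for key in ("id", "name", "price_cents")}

print(web_bff_product(7)["average_rating"])   # 4.5
print(sorted(mobile_bff_product(7)))          # ['id', 'name', 'price_cents']
```

Both BFFs call the same fetch_product, which is the point: the shaping logic lives in the gateway layer, so the backend does not grow client-specific endpoints.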

Suitable when: there are multiple clearly different client types, and each client team has the capacity to maintain its own BFF. For small teams or when client differences are minimal, a single unified Gateway is sufficient.

GraphQL offers an alternative: clients decide which fields they want, and a single endpoint serves all clients. The tradeoff is schema design complexity and the N+1 query problem.

Multi-Instance Topology and Resilience

Internet
  |
L7 LB          # distributes traffic across Gateway instances, health checks
  |
API Gateway x N instances (stateless)
  |- Auth (JWT / API key validation)
  |- Rate limiting (counters in Redis, shared across instances)
  |- Routing (/users -> User Service, /orders -> Order Service)
  |- SSL Termination
  |
L4 LB          # distributes traffic to backend pods
  |
Backend microservices

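The Gateway instances are stateless: routing rules are stored in config files or a DB, and no instance keeps per-client state in local memory. Any instance can handle any request, so horizontal scaling is natural.

Rate limiting is the stateful part. Its counters are kept in Redis and shared across all instances, which is what lets the Gateway instances themselves stay stateless.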

API Gateway instance A --|
API Gateway instance B --|---> Redis (rate limit counters)
API Gateway instance C --|

Without a shared Redis, each instance counts on its own. A user could send 100 requests to each of three instances, 300 requests in a minute, and still stay under every instance’s 100 requests/minute limit.
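The bypass is easy to reproduce in a simulation. In this sketch a plain dict stands in for Redis and the limiter uses a fixed one-minute window; the class and function names are hypothetical. With a real Redis you would typically use an atomic INCR plus an expiry on the key rather than the read-then-write shown here.

```python
class FixedWindowLimiter:
    """Fixed-window rate limiter; `counters` is the (possibly shared) store."""

    def __init__(self, limit, counters):
        self.limit = limit
        self.counters = counters  # dict standing in for Redis

    def allow(self, user, window):
        key = (user, window)  # one counter per user per time window
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= self.limit

def simulate(instances, requests=300):
    """Spread requests from one user evenly across instances; count allowed ones."""
    allowed = 0
    for i in range(requests):
        limiter = instances[i % len(instances)]
        if limiter.allow("user-1", window=0):
            allowed += 1
    return allowed

# Three instances, each with its own private counters.
isolated = [FixedWindowLimiter(100, {}) for _ in range(3)]

# Three instances sharing one store, as with a common Redis.
shared_store = {}
shared = [FixedWindowLimiter(100, shared_store) for _ in range(3)]

print(simulate(isolated))  # 300: every instance sees only 100 requests
print(simulate(shared))    # 100: the limit holds globally
```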

Typical Interview Scenarios

Q: Is the Gateway a SPOF?

Not if you deploy multiple instances behind a Load Balancer. The Gateway is stateless, so any instance can handle any request and horizontal scaling is straightforward. A single Gateway instance, by contrast, would be a SPOF.

Q: How do you share rate limit state across instances?

Store rate limit counters in Redis. All Gateway instances read and write to the same Redis, so the limit is enforced globally regardless of which instance handles the request.

Q: What’s the difference between a Gateway and a Load Balancer?

An LB distributes traffic across instances at L4 or L7. A Gateway handles application-level concerns: auth, rate limiting, routing logic. The LB sits in front of the Gateway for HA; the Gateway sits in front of your services to take cross-cutting concerns off them.

Deterministic Routing Without Shared State

How do multiple LB instances make routing decisions without sharing state?

Round Robin needs a counter (“which backend did I pick last time”). When multiple LB instances each keep their own counter, the combined distribution across backends is no longer guaranteed to be even.

Hash-based routing, including Consistent Hashing, uses a deterministic function: the same input always produces the same output, without depending on any external state.

hash(client IP) % number_of_backends = which backend to route to

IP ending in 0-3 -> Backend A
IP ending in 4-6 -> Backend B
IP ending in 7-9 -> Backend C

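Any LB instance receiving the same IP computes the same result. No communication needed, no shared state required.

The modulo formula is the simplest form of this idea. Its weakness is that changing number_of_backends remaps most clients. Consistent Hashing proper places backends on a hash ring, so that adding or removing one backend moves only about 1/N of the clients.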
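The sketch below compares the modulo formula with a hash ring, the structure usually meant by the term Consistent Hashing. The choices here are assumptions for illustration: MD5 as the hash, 100 virtual nodes per backend, and the names stable_hash, modulo_route, and HashRing. Note that Python’s built-in hash() is salted per process for strings, so it would not give the same answer on two LB instances.

```python
import bisect
import hashlib

def stable_hash(key: str) -> int:
    """Process-independent hash: every LB instance computes the same value."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

def modulo_route(client_ip, backends):
    """hash(client IP) % number_of_backends."""
    return backends[stable_hash(client_ip) % len(backends)]

class HashRing:
    """Consistent hashing: backends own arcs of a ring via virtual nodes."""

    def __init__(self, backends, vnodes=100):
        self._points = sorted(
            (stable_hash(f"{backend}#{i}"), backend)
            for backend in backends
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def route(self, client_ip):
        # First ring point clockwise from the client's hash, wrapping around.
        idx = bisect.bisect(self._hashes, stable_hash(client_ip)) % len(self._points)
        return self._points[idx][1]

clients = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]
three, four = ["A", "B", "C"], ["A", "B", "C", "D"]

# How many clients change backend when D is added?
modulo_moved = sum(modulo_route(c, three) != modulo_route(c, four) for c in clients)
ring3, ring4 = HashRing(three), HashRing(four)
ring_moved = [c for c in clients if ring3.route(c) != ring4.route(c)]

print("modulo moved:", modulo_moved)
print("ring moved:  ", len(ring_moved))
print("all ring moves go to D:", all(ring4.route(c) == "D" for c in ring_moved))
```

With the ring, a client only ever moves to the newly added backend; clients that stay keep their existing backend, which matters for sticky sessions and warm caches.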

Algorithm	Needs shared state	Reason
Round Robin	Yes	requires counter
Least Connections	Yes	requires connection count per backend
Random	No	stateless by nature
Consistent Hashing	No	deterministic function, same input = same output