First Production Custom Domain Cutover on Cloud Run: Domain Mapping, Search Console, and Certificate Wait-Window Pitfalls
This post records several real pitfalls I hit when moving a Cloud Run custom domain from an existing environment to the first formal prod service.
On the surface, it looks like nothing more than pointing api.example.com at the new Cloud Run service. But what usually blocks you is not Terraform syntax. The real blockers are:
- domain ownership
- IAM identity
- certificate provisioning
- first deploy sequencing
If these are not thought through before execution, first-time prod enablement is very easy to stall at the last step.
Typical Situation Pattern
A common scenario looks like this:
- custom domain already exists
- old mapping points to a non-prod or legacy service
- new prod service is ready
- Terraform now manages the prod service and wants to own the domain mapping too
The most intuitive idea at this point is:
- import the existing mapping
- let Terraform replace it during the first prod deploy
This direction is not wrong, but it pulls in several prerequisites outside the provider.
First Failure Class
The first error I hit was not DNS and not the Cloud Run service itself. It was a message like:
|
|
The key point of this error is not that Cloud Run is broken. It means:
- the current deploy identity can talk to GCP
- but it is not recognized as a domain owner for that domain
For Cloud Run custom domains, permission to create or recreate a domain mapping depends on verified domain ownership, which is managed through Search Console.
If a new deploy-prod service account is taking over the first formal prod cutover, that account must also be recognized as an owner of the domain.
Second Failure Class
The difference between Search Console permission levels is easy to miss.
In Search Console:
- Full user is not the same as Owner
- Full user cannot satisfy domain ownership requirements for this flow
If the deploy service account is only added as a Full user, Terraform can still fail during domain mapping creation.
The more stable approach is:
- add the deploy-prod service account as an owner
- add it on the parent domain if possible
- wait a few minutes for the permission change to propagate
- rerun the failed deploy
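Before rerunning, the ownership requirement can be sanity-checked with a small preflight. This is a minimal sketch, assuming you already have the list of domains the deploy identity is verified for (for example, copied from `gcloud domains list-user-verified` run as that identity). The function name `covered_by_verified_domain` and the sample domains are hypothetical, and the rule that a verified parent domain also covers its subdomains is my assumption about this flow, not something the sketch proves.

```python
def covered_by_verified_domain(domain: str, verified: list[str]) -> bool:
    """Return True if `domain` equals, or is a subdomain of, a verified domain.

    Assumption: ownership of a parent domain also covers its subdomains.
    """
    name = domain.lower().rstrip(".")
    for candidate in verified:
        base = candidate.lower().rstrip(".")
        # Match on a label boundary so "badexample.com" does not match "example.com".
        if name == base or name.endswith("." + base):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical list of domains the deploy-prod identity is verified for.
    verified_domains = ["example.com"]
    print(covered_by_verified_domain("api.example.com", verified_domains))
    print(covered_by_verified_domain("api.example.org", verified_domains))
```

If the check fails for the deploy identity, rerunning Terraform is unlikely to help until the Search Console side is fixed.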
Third Failure Class
During first prod cutover, a common pattern looks like this:
|
|
The benefit is that Terraform can adopt the existing mapping first, then detect route mismatch in plan, then replace.
But note:
- import success only means state ownership is established
- it does not mean the new domain target is ready yet
Whether cutover is truly complete still depends on mapping status and certificate status.
Fourth Failure Class
This is the easiest place to misdiagnose on first production enablement.
You may see:
- DNS already points to ghs.googlehosted.com
- DomainRoutable = True
- but curl https://api.example.com still fails
At this point, DNS may be correct and Terraform may be complete.
A very common reason is that the managed certificate is still provisioning.
I check domain mapping status first:
|
|
If the conditions read Ready = True, CertificateProvisioned = True, and DomainRoutable = True, then the GCP control plane is basically done. If local HTTPS is still temporarily unreachable, it is often just that edge propagation has not fully converged yet.
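The condition check can be sketched as a small helper that reads the mapping's status payload. This is only a sketch, assuming the payload is the JSON form of the domain mapping (for example, from `gcloud beta run domain-mappings describe ... --format=json`) and that conditions sit under `status.conditions` with string statuses. The helper names and the sample payload are hypothetical.

```python
import json

# Condition types named in this post; all three must be "True" for the control plane to be done.
REQUIRED_CONDITIONS = ("Ready", "CertificateProvisioned", "DomainRoutable")


def condition_statuses(mapping: dict) -> dict:
    """Map each condition type to its status string, e.g. "True" or "Unknown"."""
    conditions = mapping.get("status", {}).get("conditions", [])
    return {c.get("type"): c.get("status") for c in conditions}


def pending_conditions(mapping: dict) -> list:
    """List the required conditions that are missing or not yet "True"."""
    statuses = condition_statuses(mapping)
    return [name for name in REQUIRED_CONDITIONS if statuses.get(name) != "True"]


def control_plane_ready(mapping: dict) -> bool:
    """True only when every required condition reports "True"."""
    return not pending_conditions(mapping)


if __name__ == "__main__":
    # Hypothetical payload: DNS is routable but the certificate is still provisioning.
    sample = json.loads("""
    {"status": {"conditions": [
        {"type": "Ready", "status": "Unknown"},
        {"type": "CertificateProvisioned", "status": "Unknown"},
        {"type": "DomainRoutable", "status": "True"}
    ]}}
    """)
    print(control_plane_ready(sample))
    print(pending_conditions(sample))
```

Listing the pending conditions, rather than returning a bare boolean, makes it obvious when the only thing left is the certificate.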
Fifth Failure Class
If the domain mapping is switched together with the first prod deploy, the sequence must be strict.
I recommend this sequence now:
|
|
The easiest steps to get wrong are step 3 and step 4.
If the old root still owns the domain mapping while the new root imports the same mapping, state ownership becomes inconsistent.
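The ownership conflict can be caught before the import runs. This is a rough sketch, assuming you have the resource addresses from `terraform state list` in the old root and that the mapping is a `google_cloud_run_domain_mapping` resource. The resource names are hypothetical, and the check cannot tell which domain an address refers to, so it treats any mapping left in the old root as a conflict.

```python
MAPPING_TYPE = "google_cloud_run_domain_mapping"


def mapping_addresses(state_addresses: list) -> list:
    """Keep only addresses of domain mapping resources, including those inside modules."""
    return [addr for addr in state_addresses if f"{MAPPING_TYPE}." in addr]


def safe_to_import(old_root_addresses: list) -> bool:
    """The new root should import only after the old root tracks no domain mapping."""
    return not mapping_addresses(old_root_addresses)


if __name__ == "__main__":
    # Hypothetical `terraform state list` output from the old root.
    old_root = [
        "google_cloud_run_service.api_dev",
        "google_cloud_run_domain_mapping.api",
    ]
    print(mapping_addresses(old_root))
    print(safe_to_import(old_root))
```

If the check fails, the old root has to release the mapping from its state first (for example with `terraform state rm`) before the new root imports it.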
Sixth Failure Class
This is not a Cloud Run issue directly, but strongly related to first prod enablement.
If deploy-prod.yml already triggers on push to main, a single merge can itself become the first prod deploy.
This means you cannot review that PR with a normal feature-merge mindset. You must treat it as a production rollout event.
The safer approach is usually one of two:
- keep the first prod deploy manual
- or complete every prerequisite before the merge that enables auto prod deploy
As long as prerequisites are not finished, main should not auto-trigger first prod cutover.
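This rule can be enforced with a small review-time check. It is a sketch only, assuming deploy-prod.yml is a GitHub Actions workflow that has already been parsed into a dictionary. The function name is hypothetical, and the check deliberately ignores glob patterns, `branches-ignore`, and tag-only filters.

```python
def auto_deploys_on_main(workflow: dict) -> bool:
    """Return True if the parsed workflow would run on a push to main."""
    # Some YAML 1.1 parsers (PyYAML among them) load the bare key `on` as boolean True,
    # so look under both keys.
    triggers = workflow.get("on", workflow.get(True))
    if triggers is None:
        return False
    if isinstance(triggers, str):
        return triggers == "push"
    if isinstance(triggers, list):
        return "push" in triggers
    if "push" not in triggers:
        return False
    push = triggers["push"] or {}
    branches = push.get("branches")
    if branches is None:
        # No branch filter: conservatively treat this as "every push", main included.
        return True
    return "main" in branches


if __name__ == "__main__":
    manual_only = {"on": {"workflow_dispatch": None}}
    push_to_main = {"on": {"push": {"branches": ["main"]}}}
    print(auto_deploys_on_main(manual_only))
    print(auto_deploys_on_main(push_to_main))
```

Erring toward True when no branch filter is present is intentional: a false alarm in review is cheaper than an unplanned prod cutover.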
Verification Model I Use
I verify completion in three layers:
Terraform Layer
- prod service root plans cleanly
- the import block is no longer needed
- the old root no longer owns the mapping
Control-Plane Layer
- mappedRouteName points to the prod service
- Ready = True
- CertificateProvisioned = True
- DomainRoutable = True
User-Visible Layer
- curl -I https://api.example.com/health succeeds
- opening the domain in a browser works
- service behavior matches prod, not legacy/dev
If only the first two layers pass and the third still fails, I usually do not call cutover complete.
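Because the user-visible layer can lag the other two, a bounded polling loop is more useful than a single curl. The sketch below is one way to do it, not a definitive implementation. The names `wait_until` and `https_ok` are hypothetical, and the injectable `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def https_ok(url: str, timeout_s: float = 5.0) -> bool:
    """Send a HEAD request; any TLS, DNS, timeout, or HTTP error counts as not ready."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        return False


def wait_until(
    probe: Callable[[], bool],
    timeout_s: float,
    interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call `probe` every `interval_s` seconds until it succeeds or the window closes."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        # Stop if the next attempt would start after the deadline.
        if clock() + interval_s > deadline:
            return False
        sleep(interval_s)
```

For example, `wait_until(lambda: https_ok("https://api.example.com/health"), timeout_s=900, interval_s=30)` polls for up to 15 minutes. The 15-minute window is an arbitrary choice for illustration, not a documented provisioning time.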
Practical Mental Model
I now split this into two different questions:
- who owns the domain mapping state
- who is authorized to administer the domain
Terraform import handles only the first question.
Search Console ownership and Cloud Run domain authorization handle the second question.
If these are mixed together, it is easy to think import success means everything is ready.
Runtime Verification Commands
During first custom-domain cutover to prod, the most frequent lookups are mapping status and service URL. I keep these commands directly in notes.
Check Full Status Payload
|
|
Check Current Target Name
|
|
List Region-Level Entries
|
|
Check Endpoint Value
|
|
Check Current Container Artifact
|
|
Check Runtime Traffic Detail
|
|
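For reference, here is how I would assemble lookups of this kind. It is a sketch under stated assumptions, not a record of the exact commands: the mapping from each heading to a `gcloud` invocation is my guess, the field paths in the `--format` projections assume the usual Cloud Run resource layout, and the service name and region in the example are hypothetical. The code only builds and prints the commands. It does not run them.

```python
import shlex


def lookup_commands(domain: str, service: str, region: str) -> dict:
    """Build argv lists for the status lookups; nothing is executed here."""
    mapping = ["gcloud", "beta", "run", "domain-mappings"]
    services = ["gcloud", "run", "services"]
    scope = ["--region", region]
    describe_mapping = mapping + ["describe", "--domain", domain, *scope]
    describe_service = services + ["describe", service, *scope]
    return {
        "full_status": describe_mapping + ["--format=yaml"],
        "target_name": describe_mapping + ["--format=value(status.mappedRouteName)"],
        "region_entries": mapping + ["list", *scope],
        "endpoint": describe_service + ["--format=value(status.url)"],
        "image": describe_service
        + ["--format=value(spec.template.spec.containers[0].image)"],
        "traffic": describe_service + ["--format=yaml(status.traffic)"],
    }


if __name__ == "__main__":
    # Hypothetical service name and region.
    commands = lookup_commands("api.example.com", "api-prod", "us-central1")
    for name, argv in commands.items():
        print(f"{name}: {shlex.join(argv)}")
```

Keeping the commands as argv lists avoids quoting problems with the parentheses and brackets in the format expressions.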
Conclusion
When cutting a Cloud Run custom domain over to a formal prod service for the first time, the hard part is usually not HCL. The hard part is the boundary conditions outside the cloud control plane.
The most important points are:
- make sure the deploy identity is a real domain owner
- treat the first prod deploy as a rollout event
- verify mapping status and certificate status separately
- remove one-time import scaffolding after cutover
If these are kept separate, first prod enablement becomes much more predictable.