From Runtime Translation to Real English Content: Why I Built content-i18n

My Blog already had cmpt-translate, so pages could be translated into English at runtime

But during coffee chats in Toronto, someone put it to me directly: if the Blog is part of my portfolio, its default language should be English

That was not about a missing feature

It was about what the primary content of the Blog really is

Runtime translation can turn a page into another language, but it does not mean I really have English content, and it does not mean I have a workflow for maintaining English posts over time

So I started building content-i18n

The goal was clear: move the Blog from “Chinese as the main content, English through runtime translation” to real English posts that I can maintain

The problem was not translating faster

If I only wanted an English draft, there were many ways to do it:

  • paste the post into AI Chat
  • use Google Translate or DeepL
  • call a translation API to generate a draft
  • let an agent rewrite it into English

All of these work, and they are fast

But a technical Blog is not only prose

A post usually mixes:

  • heading and section order
  • code blocks
  • inline commands
  • config keys
  • product names
  • error strings
  • tables
  • blockquotes
  • argument flow

I wanted to keep the same article, not only switch the language

So the core rule is simple:

  • same article
  • different language only

What fidelity-first means

Fidelity-first does not mean word-by-word translation, and it does not mean the English should sound stiff

It settles one trade-off: when “reads better” and “preserves the source” conflict, which side the workflow protects first

My answer is to protect the source first

So content-i18n checks these things:

  • heading hierarchy
  • section order
  • paragraph coverage
  • list structure
  • table structure
  • code blocks
  • technical inline literals
  • links and references
  • argument flow
  • conclusion scope

If the English reads smoothly but drops a caveat, compresses an example, or rewrites a command into another form, that is not the result I want

That is why I started treating translation as a validation problem, not only a generation quality problem
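To make "validation problem" concrete, here is a minimal sketch of a structural check. It assumes posts are Markdown with ATX headings and backtick fences; the function names are illustrative and are not content-i18n's actual API.

```python
import re

TICKS = "`" * 3  # fence marker, built here so this example stays fence-safe
FENCE = re.compile(rf"^{TICKS}.*?^{TICKS}[ \t]*$", re.MULTILINE | re.DOTALL)
HEADING = re.compile(r"^(#{1,6})[ \t]+\S", re.MULTILINE)


def structure(markdown: str) -> dict:
    """Collect the parts of a post that should survive translation unchanged."""
    code_blocks = FENCE.findall(markdown)
    # Drop code first so '#' comments inside code are not counted as headings.
    prose = FENCE.sub("", markdown)
    heading_levels = [len(m.group(1)) for m in HEADING.finditer(prose)]
    return {"heading_levels": heading_levels, "code_blocks": code_blocks}


def fidelity_issues(source: str, target: str) -> list[str]:
    """Compare heading hierarchy and code blocks between source and target."""
    src, tgt = structure(source), structure(target)
    issues = []
    if src["heading_levels"] != tgt["heading_levels"]:
        issues.append("heading hierarchy differs")
    if src["code_blocks"] != tgt["code_blocks"]:
        issues.append("code blocks differ")
    return issues
```

An empty list means these two checks passed. It says nothing about paragraph coverage or argument flow, which need their own checks.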

Where runtime translation was not enough

The hardest problem was not that the model translated everything completely wrong

The more common problem was that the output looked close enough, but the content slowly drifted

Problems I saw included:

  • the opening paragraph became more convoluted than the source, or added content that was not in the source
  • a heading became more like an article title, but the original point disappeared
  • the code block stayed, but the surrounding sentence drifted
  • one section got compressed because the model judged it repetitive
  • Chinese still remained in the English draft

These problems do not always make the article look broken, but they slowly turn it into another article

So I did not want a process where I copy-paste everything and reread the whole post from scratch every time

Workflow design

What I needed was a workflow, not a prompt

prepare
  -> write target
  -> review
  -> fix
  -> sync-status

Each step has its own responsibility
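The review and fix steps can be sketched as a small loop. This is an assumption about how the steps connect (fix feeding back into review until it passes); `run_unit`, `review`, and `fix` are hypothetical names, not the tool's real interface.

```python
from typing import Callable


def run_unit(
    draft: str,
    review: Callable[[str], list[str]],
    fix: Callable[[str, list[str]], str],
    max_rounds: int = 3,
) -> tuple[str, bool]:
    """Review a draft, fix reported issues, and repeat until clean or out of rounds."""
    for _ in range(max_rounds):
        issues = review(draft)
        if not issues:
            return draft, True  # clean: ready for sync-status
        draft = fix(draft, issues)
    # Out of rounds: report whether the last fix left the draft clean.
    return draft, not review(draft)
```

The boolean matters more than the draft: only a draft that comes back with `True` should move on to sync-status.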

prepare

prepare does more than read the source file

It defines the translation unit:

  • source path
  • target path
  • structure fingerprint
  • glossary
  • style pack
  • prompt context
  • target metadata

This step affects output stability

If the context given to the AI model or provider changes on every run, the target is hard to keep stable
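One way to picture a translation unit is as a frozen record whose contents can be hashed, so two runs can confirm they started from the same context. The field names below mirror the list above, but the class and its `context_hash` method are an illustrative sketch, not the tool's real data model.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class TranslationUnit:
    source_path: str
    target_path: str
    structure_fingerprint: str
    glossary: dict[str, str] = field(default_factory=dict)
    style_pack: str = "default"
    prompt_context: str = ""
    target_metadata: dict[str, str] = field(default_factory=dict)

    def context_hash(self) -> str:
        """Hash every field so a change in context is detectable between runs."""
        # sort_keys makes the hash independent of dict insertion order.
        payload = json.dumps(asdict(self), sort_keys=True, ensure_ascii=False)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

If the hash differs between two runs on the same source, the difference in output may come from the context rather than from the model.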

write target

write target creates the draft

The draft can come from a human, an AI model, or a provider-backed workflow

But the existence of a target file does not mean the translation is complete

A draft is still only a draft

review

review is the most important step in the workflow

It cannot stop at checking whether the English reads well or whether the grammar is correct

It needs to check:

  • are headings aligned
  • are blockquotes and tables still there
  • were code blocks changed
  • did technical inline literals drift
  • is source-language text still left where a translation should exist

For knowledge-management content, this kind of review is more useful than only checking grammar
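Two of the checks above, inline literal drift and leftover source language, are easy to sketch. The example assumes a Chinese source, backtick-delimited inline code, and prose with fenced blocks already removed; the function name is hypothetical.

```python
import re

INLINE_CODE = re.compile(r"`([^`\n]+)`")
CJK = re.compile(r"[\u4e00-\u9fff]")  # CJK Unified Ideographs (basic block only)


def review_literals_and_language(source: str, target: str) -> list[str]:
    """Flag drifted inline literals and Chinese left in the target prose."""
    issues = []
    # Literals should be copied verbatim, so both sides must hold the same set.
    if sorted(INLINE_CODE.findall(source)) != sorted(INLINE_CODE.findall(target)):
        issues.append("inline literals drifted")
    # Ignore inline code when scanning for Chinese: literals are kept as-is.
    prose = INLINE_CODE.sub("", target)
    if CJK.search(prose):
        issues.append("source language remains in target")
    return issues
```

A real check would also need to allow intentional Chinese, such as a quoted term or a proper noun, which this sketch treats as a failure.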

sync-status

sync-status records completion in the tool's official state

Without this step, the queue cannot clearly tell the difference between:

  • a draft someone edited
  • a translation that was reviewed and is still synced with the source

Queue state

Queue state made the tool feel more like a system, not just a set of commands

content-i18n uses three states:

  • completed
  • stale
  • missing

These states are derived from:

  • source discovery
  • expected target file
  • source hash
  • translation status

Translation is not a one-time task

A target completed today can become stale tomorrow if the source changes

My Blog posts change over time: I rewrite sentences, add paragraphs, adjust examples, and fix structure

So the workflow cannot only ask whether something has ever been translated

It also has to ask whether it still matches the source now
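The three states can be derived from a small amount of data. The sketch below is one plausible rule, not the tool's exact logic; in particular, treating an existing but never-synced target as stale is my assumption for the example.

```python
import hashlib
from typing import Optional


def content_hash(text: str) -> str:
    """Hash the source text so later edits can be detected."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def queue_state(
    current_source_hash: str,
    target_exists: bool,
    synced_source_hash: Optional[str],
) -> str:
    """Derive completed / stale / missing for one source file."""
    if not target_exists:
        return "missing"
    # Completed only if the hash recorded at sync time still matches the source.
    if synced_source_hash == current_source_hash:
        return "completed"
    # Assumption: an unsynced draft or a changed source both count as stale.
    return "stale"
```

Because the state is computed from the current source hash, editing a post is enough to move its translation from completed back to stale.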

Completion rule

Completion cannot be fuzzy

I ended up using this rule:

  1. review or validation has to pass
  2. source-language text cannot remain where a translation should exist
  3. sync-status has to succeed

This is much stricter than “looks close enough”

I often saw posts where the structure looked fine, but headings, metadata, or inline notes still had unfinished translation left behind

If those are not blocked, they can easily reach published content

That is where sync-status matters

It turns completion from a subjective feeling into state the system can derive and track
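The completion rule can be read as a gate in front of the status write. This is a minimal sketch with hypothetical names (`ReviewResult`, `sync_status`) and an in-memory dict standing in for wherever the real tool stores its state.

```python
from dataclasses import dataclass


@dataclass
class ReviewResult:
    passed: bool                    # rule 1: review or validation passed
    leftover_source_language: bool  # rule 2: must be False


def sync_status(
    status: dict, target_path: str, source_hash: str, review: ReviewResult
) -> bool:
    """Record completion only if rules 1 and 2 hold; return whether it was recorded."""
    if not review.passed or review.leftover_source_language:
        return False  # rule 3 fails: nothing is written
    status[target_path] = {"state": "completed", "source_hash": source_hash}
    return True
```

Storing the source hash alongside the state is what lets the queue later notice that a completed target has gone stale.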

Why it became a standalone tool

If this were only for translating the Blog, a few scripts could work

But later I needed:

  • a queue model
  • provider options
  • repeatable prepare / review / sync steps
  • stricter validation
  • an MCP interface for agents

Once those requirements appeared, scripts were no longer clear enough

Making it a standalone tool gave the boundary a cleaner shape:

  • config stays in the consumer repo
  • provider secrets stay outside the repo
  • workflow logic stays inside the tool
  • site routing and theme do not mix with the translation engine
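The first two boundaries can be sketched as a loader that reads config from the repo and secrets from the environment. The config keys and the environment variable name here are hypothetical, chosen only for the example.

```python
import json
from typing import Mapping

SECRET_ENV_VAR = "CONTENT_I18N_PROVIDER_KEY"  # hypothetical variable name


def load_settings(config_text: str, env: Mapping[str, str]) -> dict:
    """Merge repo config with a provider secret that lives outside the repo."""
    config = json.loads(config_text)
    # Refuse configs that carry the secret, so it cannot be committed by accident.
    if "api_key" in config:
        raise ValueError("provider secrets must not live in the repo config")
    return {**config, "api_key": env.get(SECRET_ENV_VAR)}
```

In practice the caller would pass the text of a config file in the consumer repo and `os.environ` as `env`.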

That made it more like a tool I can maintain over time, not a script patched together only for the Blog

Final notes

The first problem content-i18n tried to solve was simple: move the Blog from runtime translation to real English content that I can maintain

After implementation, the problem became:

  • how to keep the same article
  • what must be preserved
  • what counts as complete
  • how the queue should track translation state

What I needed was not one generated English draft, but a workflow for maintaining English content over time
