From Runtime Translation to Real English Content: Why I Built content-i18n
My Blog already had cmpt-translate, so pages could be translated into English at runtime
But during coffee chats in Toronto, someone directly suggested this: if the Blog is part of my portfolio, the default language should be English
That was not about a missing feature
It was about what the main body of the content really is
Runtime translation can turn a page into another language, but it does not mean I really have English content, and it does not mean I have a workflow for maintaining English posts over time
So I started building content-i18n
The goal was clear: move the Blog from “Chinese as the main content, English through runtime translation” to real English posts that I can maintain
The problem was not translating faster
If I only wanted an English draft, there were many ways to do it:
- paste the post into AI Chat
- use Google Translate or DeepL
- call a translation API to generate a draft
- let an agent rewrite it into English
All of these work, and they are fast
But a technical Blog is not only prose
A post usually mixes:
- heading and section order
- code blocks
- inline commands
- config keys
- product names
- error strings
- tables
- blockquotes
- argument flow
I wanted to keep the same article, not only switch the language
So the core rule is simple:
- same article
- different language only
What fidelity-first means
Fidelity-first does not mean word-by-word translation, and it does not mean the English should sound stiff
It handles one trade-off: when “reads better” and “preserves the source” start conflicting, which side should the workflow protect first
My answer is to protect the source first
So content-i18n checks these things:
- heading hierarchy
- section order
- paragraph coverage
- list structure
- table structure
- code blocks
- technical inline literals
- links and references
- argument flow
- conclusion scope
If the English reads smoothly but drops a caveat, compresses an example, or rewrites a command into another form, that is not the result I want
That is why I started treating translation as a validation problem, not only a generation quality problem
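To make the validation idea concrete, here is a minimal sketch of one such check: code blocks and inline literals must survive translation unchanged
It is not content-i18n's actual implementation; the function names and the regex-based Markdown parsing are assumptions for illustration only

```python
import re
from collections import Counter

# Fenced blocks: an opening ``` line through the next line that starts with ```.
FENCE_RE = re.compile(r"^```.*?^```[ \t]*$", re.MULTILINE | re.DOTALL)
# Inline code spans on a single line, e.g. `npm run build`.
INLINE_RE = re.compile(r"`([^`\n]+)`")


def extract_literals(markdown: str) -> tuple[list[str], Counter]:
    """Return fenced code blocks (in order) and a count of inline code spans."""
    blocks = FENCE_RE.findall(markdown)
    # Remove fenced blocks first so their contents are not counted as inline code.
    prose = FENCE_RE.sub("", markdown)
    inline = Counter(INLINE_RE.findall(prose))
    return blocks, inline


def literal_drift(source: str, target: str) -> list[str]:
    """List human-readable problems; an empty list means nothing drifted."""
    src_blocks, src_inline = extract_literals(source)
    tgt_blocks, tgt_inline = extract_literals(target)
    problems = []
    if src_blocks != tgt_blocks:
        problems.append("fenced code blocks differ")
    # Counter subtraction keeps only literals the target has fewer of.
    for literal, count in (src_inline - tgt_inline).items():
        problems.append(f"inline literal missing from target: {literal!r} (x{count})")
    return problems
```

A check like this says nothing about whether the English reads well; it only reports when a command or code block was rewritten into another form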
Where runtime translation was not enough
The hardest problem was not that the model translated everything completely wrong
The more common problem was that the output looked close enough, but the content slowly drifted
Problems I saw included:
- the opening paragraph became more convoluted than the source, or added content that was not in the source
- a heading became more like an article title, but the original point disappeared
- the code block stayed, but the surrounding sentence drifted
- one section got compressed because the model judged it repetitive
- Chinese still remained in the English draft
These problems do not always make the article look broken, but they slowly turn it into another article
So I did not want a process where I copy-paste everything and reread the whole post from scratch every time
Workflow design
What I needed was a workflow, not a prompt
prepare -> write target -> review -> sync-status
Each step has its own responsibility
prepare
prepare is not only reading the source file
It defines the translation unit:
- source path
- target path
- structure fingerprint
- glossary
- style pack
- prompt context
- target metadata
This step affects output stability
If the context given to the AI model or provider changes on every run, the target is hard to keep stable
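One way to picture a translation unit with a structure fingerprint is the sketch below
The `TranslationUnit` class, its field names, and the outline-hashing approach are hypothetical; the real tool may compute its fingerprint differently

```python
import hashlib
import re
from dataclasses import dataclass, field


def structure_fingerprint(markdown: str) -> str:
    """Hash the structural outline of a post, ignoring the prose itself."""
    outline = []
    in_fence = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("```"):
            # Record a fence only when it opens, so each block counts once.
            if not in_fence:
                outline.append("code")
            in_fence = not in_fence
        elif in_fence:
            continue  # code content does not affect the outline
        elif match := re.match(r"(#{1,6})\s", stripped):
            outline.append(f"h{len(match.group(1))}")
        elif re.match(r"([-*+]|\d+\.)\s", stripped):
            outline.append("li")
        elif stripped.startswith("|"):
            outline.append("row")
        elif stripped.startswith(">"):
            outline.append("quote")
    return hashlib.sha256("\n".join(outline).encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class TranslationUnit:
    """Everything one translation run is given (hypothetical shape)."""
    source_path: str
    target_path: str
    fingerprint: str
    glossary: dict = field(default_factory=dict)
    style_pack: str = "default"
```

Because the fingerprint depends only on headings, list items, table rows, blockquotes, and code fences, a faithful translation produces the same value as its source, while a dropped list item or heading changes it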
write target
write target creates the draft
The draft can come from a human, an AI model, or a provider-backed workflow
But a target file existing does not mean it is complete
A draft is still only a draft
review
review is the most important step in the workflow
It cannot only check whether the English reads well or whether the grammar is correct
It needs to check:
- are headings aligned
- are blockquotes and tables still there
- were code blocks changed
- did technical inline literals drift
- is source-language text still left where a translation should exist
For knowledge-management content, this kind of review is more useful than only checking grammar
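As an illustration of the leftover-source-language check, the sketch below flags prose lines in an English draft that still contain Chinese characters
It assumes the source language is Chinese and that Chinese inside code blocks or inline code is intentional; the function name and the character ranges chosen are my own assumptions, not the tool's API

```python
import re

# Han characters: Extension A, the main CJK block, and compatibility ideographs.
HAN_RE = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff\uf900-\ufaff]")
INLINE_CODE_RE = re.compile(r"`[^`\n]*`")


def leftover_source_lines(target: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs where Chinese text remains in prose."""
    hits = []
    in_fence = False
    for number, line in enumerate(target.splitlines(), start=1):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # code blocks are expected to match the source exactly
        # Ignore inline code so literals copied from the source are not flagged.
        prose = INLINE_CODE_RE.sub("", line)
        if HAN_RE.search(prose):
            hits.append((number, line))
    return hits
```

Returning line numbers instead of a plain pass/fail makes the result usable as a review list, so only the flagged lines need rereading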
sync-status
sync-status writes completion into the official state
Without this step, the queue cannot clearly tell the difference between:
- a draft someone edited
- a translation that was reviewed and is still synced with the source
Queue state
Queue state made the tool feel more like a system, not just a set of commands
content-i18n uses three states:
- completed
- stale
- missing
These states are derived from:
- source discovery
- expected target file
- source hash
- translation status
Translation is not a one-time task
A target completed today can become stale tomorrow if the source changes
My Blog posts change over time: I rewrite sentences, add paragraphs, adjust examples, and fix structure
So the workflow cannot only ask whether something has ever been translated
It also has to ask whether it still matches the source now
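The derivation can be sketched roughly as follows
`SyncRecord`, `file_hash`, and `queue_state` are hypothetical names, and treating a target with no sync record as stale is my assumption rather than documented behavior

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SyncRecord:
    source_hash: str  # hash of the source at the time the translation was synced


def file_hash(path: Path) -> str:
    """SHA-256 of the file contents, used to detect source changes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def queue_state(source: Path, target: Path, record: Optional[SyncRecord]) -> str:
    """Derive "missing", "stale", or "completed" for one source/target pair."""
    if not target.exists():
        return "missing"
    # No sync record (assumption: an unreviewed draft) or a changed source
    # both mean the target cannot be trusted as current.
    if record is None or record.source_hash != file_hash(source):
        return "stale"
    return "completed"
```

Since the state is recomputed from the current source hash each time, editing a source post is enough to move its translation back to stale without anyone updating a flag by hand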
Completion rule
Completion cannot be fuzzy
I ended up using this rule:
- review or validation has to pass
- source-language text cannot remain where a translation should exist
- sync-status has to succeed
This is much stricter than “looks close enough”
I often saw posts where the structure looked fine, but headings, metadata, or inline notes still had unfinished translation left behind
If those are not blocked, they can easily reach published content
That is where sync-status matters
It turns completion from a subjective feeling into state the system can derive and track
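Expressed as code, the completion rule is a gate with three conditions that must all hold
The `CompletionCheck` class below is a hypothetical sketch of that gate, not the tool's real data model

```python
from dataclasses import dataclass


@dataclass
class CompletionCheck:
    review_passed: bool
    leftover_source_lines: int  # lines where source-language text remains
    sync_status_ok: bool

    def blockers(self) -> list[str]:
        """Return every unmet condition; an empty list means complete."""
        found = []
        if not self.review_passed:
            found.append("review or validation did not pass")
        if self.leftover_source_lines > 0:
            found.append(f"{self.leftover_source_lines} line(s) still in the source language")
        if not self.sync_status_ok:
            found.append("sync-status did not succeed")
        return found

    @property
    def complete(self) -> bool:
        return not self.blockers()
```

Listing the blockers instead of returning a single boolean keeps the reason for an incomplete post visible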
Why it became a standalone tool
If this were only for translating the Blog, a few scripts would have been enough
But later I needed:
- queue model
- provider options
- repeatable prepare / review / sync steps
- stricter validation
- MCP interface for agents
Once those requirements appeared, scripts were no longer clear enough
Making it a standalone tool gave the boundary a cleaner shape:
- config stays in the consumer repo
- provider secrets stay outside the repo
- workflow logic stays inside the tool
- site routing and theme do not mix with the translation engine
That made it more like a tool I can maintain over time, not a script patched together only for the Blog
Final notes
The first problem content-i18n tried to solve was simple: move the Blog from runtime translation to real English content that I can maintain
After implementation, the problem became:
- how to keep the same article
- what must be preserved
- what counts as complete
- how the queue should track translation state
What I needed was not one generated English draft, but a workflow for maintaining English content over time