
i18n Health Checks: Catch Missing Translations Before They Ship

Eray Gündoğmuş

Missing translations are the kind of bug that slips through every stage of your pipeline. Unit tests pass because they do not check translation files. Integration tests pass because they run in the default language. QA passes because manual testing rarely covers all twelve languages. And then a user in Brazil sees checkout.confirm_button where a button label should be, and you get a bug report that makes your team look careless.

The problem is not that teams forget about translations. It is that there is no automated check to catch translation gaps the way a type checker catches type errors or a linter catches code style issues. Your code has ESLint, Prettier, TypeScript, and a full CI pipeline. Your translations have... a JSON file that someone hopefully remembered to update.

This post covers how to implement automated i18n health checks that catch missing translations, placeholder mismatches, orphan keys, and hardcoded strings before they reach production.


What Does an i18n Health Check Actually Check?

A comprehensive i18n health check evaluates four dimensions of your translation setup:

1. Coverage: Are All Keys Translated?

Coverage is the most straightforward check and the most impactful. For every translation key used in your source code, does a translation exist in every target language?

Source code references: 1,247 keys
English (source):      1,247/1,247 (100%)
Spanish:               1,235/1,247 (99%)
French:                1,198/1,247 (96%)
Japanese:              1,150/1,247 (92%)
Korean:                1,089/1,247 (87%)

A coverage check catches the most common scenario: a developer adds a new feature, writes the English strings, and moves on to the next task. The keys go into the English JSON file but never get sent for translation. Without a coverage check, the gap is invisible until a user encounters it.

Coverage checks also catch namespace mismatches. If your code references t('checkout.confirm') but the checkout namespace does not exist for Korean, that is a coverage gap that will show a raw key to Korean users.
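The set logic behind a coverage check is easy to sketch. Here is a minimal illustration, assuming key extraction has already produced a flat list of referenced keys and each language file has been flattened into a key-value map (all names here are illustrative, not the Doctor internals):

```typescript
// Coverage check sketch: for each target language, report keys that
// are referenced in source code but absent from that language's map.
type TranslationMap = Record<string, string>;

function findMissingKeys(
  referencedKeys: string[],
  translations: Record<string, TranslationMap>,
): Record<string, string[]> {
  const missing: Record<string, string[]> = {};
  for (const [lang, map] of Object.entries(translations)) {
    missing[lang] = referencedKeys.filter((key) => !(key in map));
  }
  return missing;
}

// Example: 'checkout.confirm' is translated in Spanish but not Korean.
const gaps = findMissingKeys(
  ["checkout.confirm", "home.title"],
  {
    es: { "checkout.confirm": "Confirmar", "home.title": "Inicio" },
    ko: { "home.title": "홈" },
  },
);
// gaps.es is empty; gaps.ko contains "checkout.confirm"
```

The per-language percentages in the report above are just `(referenced - missing) / referenced` for each language.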

2. Quality: Are Translations Structurally Correct?

Coverage only tells you whether a translation exists. Quality tells you whether it will work correctly at runtime.

The most critical quality check is placeholder validation. If your English string is:

"You have {count} items in your cart, {name}."

Then every translation of that string must contain exactly {count} and {name}. A French translator who writes {nombre} instead of {count} creates a runtime bug — the interpolation engine will not find a value for {nombre} and will either display the raw placeholder or throw an error.
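The core of placeholder validation is extracting the placeholder names from both strings and diffing the two sets. A minimal sketch, assuming simple `{name}` interpolation syntax (ICU MessageFormat with plurals and selects would need a real parser):

```typescript
// Extract {name}-style placeholders from a translation string.
function placeholders(s: string): Set<string> {
  return new Set([...s.matchAll(/\{(\w+)\}/g)].map((m) => m[1]));
}

// Diff source placeholders against target placeholders.
function placeholderMismatch(source: string, target: string) {
  const src = placeholders(source);
  const tgt = placeholders(target);
  return {
    missing: [...src].filter((p) => !tgt.has(p)), // in source, not target
    extra: [...tgt].filter((p) => !src.has(p)),   // in target, not source
  };
}

const diff = placeholderMismatch(
  "You have {count} new messages from {sender}",
  "Sie haben {anzahl} neue Nachrichten von {sender}",
);
// diff.missing → ["count"], diff.extra → ["anzahl"]
```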

Other quality checks include:

  • Empty values: Keys that exist in a language file but have empty strings. These usually indicate programmatic key creation without actual translation.
  • Source-identical strings: Translations that are character-for-character identical to the source language. Some strings (brand names, URLs) are legitimately identical, but a high count usually means untranslated content.
  • Excessive length: Translations that are significantly longer than the source, which may overflow UI containers. German translations are notoriously 30-40% longer than English.
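The length check from the list above reduces to a ratio comparison. A sketch, where the 1.5x threshold is an illustrative assumption rather than a fixed rule:

```typescript
// Flag translations that are more than `ratio` times the source length,
// since they are likely to overflow UI containers.
function isExcessivelyLong(source: string, target: string, ratio = 1.5): boolean {
  return target.length > source.length * ratio;
}

isExcessivelyLong("Settings", "Benutzerkontoeinstellungen"); // → true
isExcessivelyLong("Hello world", "Hallo Welt");              // → false
```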

3. Structure: Are Translation Files Clean?

Structure checks evaluate the organization and hygiene of your translation files:

  • Orphan keys: Keys present in translation files but never referenced in source code. These accumulate when features are removed but translation files are not cleaned up. They waste translator effort and create confusion.
  • Duplicate keys: The same key defined twice in a single file. JSON does not error on duplicate keys — it silently uses the last one, which can lead to confusing behavior.
  • Naming inconsistency: If 90% of your keys use snake_case but a few use camelCase, the inconsistency makes keys harder to find and maintain.
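The naming-consistency check from the list above can be sketched as classifying each key's style and flagging the minority. The classification rule below is an illustrative simplification (it only distinguishes camelCase from everything else):

```typescript
// Flag keys whose naming style is in the minority across the project.
function inconsistentKeys(keys: string[]): string[] {
  const isCamel = (k: string) => /[a-z][A-Z]/.test(k);
  const camel = keys.filter(isCamel);
  const other = keys.filter((k) => !isCamel(k));
  // Whichever style is rarer is the inconsistency.
  return camel.length <= other.length ? camel : other;
}

const flagged = inconsistentKeys([
  "checkout.confirm_order",
  "checkout.payment_method",
  "checkout.orderTotal", // the odd one out
]);
// flagged → ["checkout.orderTotal"]
```

Orphan-key detection is the mirror image of the coverage check: instead of source keys missing from translation files, it reports translation-file keys never referenced in source.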

4. Code: Are Strings Properly Internationalized?

Code analysis uses AST parsing to find hardcoded strings in your source files that should be wrapped in translation functions.

// Flagged: hardcoded user-facing string
<h1>Welcome to our app</h1>

// Not flagged: properly internationalized
<h1>{t('home.welcome_title')}</h1>

// Not flagged: non-user-facing (CSS class, data attribute)
<div className="container" data-testid="home">

This check catches i18n debt at the source. New developers who are not familiar with your i18n setup write hardcoded strings. Without an automated check, these strings persist until someone notices them during a translation audit.
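A real implementation parses each file into an AST (for example with the TypeScript compiler API or Babel) so it can distinguish JSX text from attributes and expressions. As a rough, deliberately simplified sketch of the idea, a regex version that only flags literal text between JSX tags looks like this:

```typescript
// Simplified sketch: flag literal text between JSX tags. A real check
// uses an AST to avoid false positives (comments, strings in props, etc.).
function findHardcodedJsxText(source: string): string[] {
  return [...source.matchAll(/>([^<>{}\n]*[A-Za-z][^<>{}\n]*)</g)]
    .map((m) => m[1].trim())
    .filter((text) => text.length > 0);
}

const hits = findHardcodedJsxText(
  `<h1>Welcome to our app</h1>\n<h1>{t('home.welcome_title')}</h1>`,
);
// hits → ["Welcome to our app"] — the t() call is not flagged
```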


The Health Score: Reducing Complexity to a Number

Individual checks produce detailed reports, but for CI integration and trend tracking, you need a single number: the health score.

A well-designed health score weighs categories by their user impact:

Category    Weight   Rationale
Coverage    40%      Missing translations directly affect users
Quality     30%      Placeholder bugs cause runtime errors
Structure   20%      Orphan keys waste effort but do not break UX
Code        10%      Hardcoded strings are debt, not immediate breakage

A project scoring 87/100 might break down as:

Overall: 87/100 PASSED

Coverage     92/100  ██████████████████░░  3 missing keys
Quality      85/100  █████████████████░░░  2 placeholder mismatches
Structure    78/100  ███████████████░░░░░  12 orphan keys
Code         90/100  ██████████████████░░  4 hardcoded strings

The pass/fail threshold is configurable. A threshold of 80 is practical for most teams — strict enough to catch real issues, lenient enough that minor warnings do not block deploys.
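With the category weights above, the overall score is just a weighted average compared against the threshold. A sketch (the function shape is illustrative, not the Doctor API):

```typescript
// Weighted health score using the category weights from the table above.
const WEIGHTS = { coverage: 0.4, quality: 0.3, structure: 0.2, code: 0.1 };

function healthScore(
  scores: Record<keyof typeof WEIGHTS, number>,
  threshold = 80,
) {
  const overall = Object.entries(WEIGHTS).reduce(
    (sum, [cat, w]) => sum + scores[cat as keyof typeof WEIGHTS] * w,
    0,
  );
  return { overall: Math.round(overall), passed: overall >= threshold };
}

const result = healthScore({ coverage: 92, quality: 85, structure: 78, code: 90 });
// 92*0.4 + 85*0.3 + 78*0.2 + 90*0.1 = 86.9 → { overall: 87, passed: true }
```

This reproduces the 87/100 breakdown shown above.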


Setting Up i18n Health Checks in CI/CD

The real value of health checks comes from running them automatically on every pull request. Here is how to set up a GitHub Actions workflow:

# .github/workflows/i18n-doctor.yml
name: i18n Health Check

on:
  pull_request:
    paths:
      - "locales/**"
      - "src/**"
      - "messages/**"

jobs:
  doctor:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Bun
        uses: oven-sh/setup-bun@v2

      - name: Install dependencies
        run: bun install

      - name: Run i18n Doctor
        run: bunx @better-i18n/cli doctor --ci --threshold 80
        env:
          BETTER_I18N_API_KEY: ${{ secrets.BETTER_I18N_API_KEY }}

What the --ci Flag Does

The --ci flag changes Doctor's behavior for CI environments:

  1. Exit code: Returns exit code 1 if the scan fails, causing the GitHub Actions job to fail
  2. GitHub annotations: Outputs issues in GitHub Actions annotation format, so they appear as inline comments on the PR diff
  3. Summary: Produces a structured summary for the GitHub Actions check output
  4. Non-interactive: Suppresses progress bars and color output

Path Filters Matter

The paths filter in the workflow configuration is important for performance. Without it, the health check runs on every PR, including PRs that only change documentation or backend code with no translation impact. Filter to your translation file directories and your source code directories.

Reporting Results to the Platform

Add the --report flag to submit results to your translation management platform:

- name: Run i18n Doctor
  run: bunx @better-i18n/cli doctor --ci --report --threshold 80
  env:
    BETTER_I18N_API_KEY: ${{ secrets.BETTER_I18N_API_KEY }}

Reports include the commit SHA, branch name, file count, and key count. Over time, this builds a history of your i18n health that you can use to track improvements, catch regressions, and set team goals.

Auto-Generated Workflows

If configuring GitHub Actions manually seems like unnecessary friction, some translation platforms (including Better i18n) can create the workflow file for you. The platform uses the GitHub API to open a PR on your repository with a pre-configured workflow file. You review it, merge it, and the health check is active.


What Happens When a Check Fails

A failing health check should provide actionable information, not just a red X. Here is what a useful failure looks like:

Missing Translation Keys

Error: 12 keys missing in target languages

  checkout.confirm_order
    Missing in: fr, de, ja, ko
    Added in commit: abc1234 (2 days ago)
    File: src/pages/Checkout.tsx:45

  checkout.payment_method
    Missing in: fr, de, ja, ko
    Added in commit: abc1234 (2 days ago)
    File: src/pages/Checkout.tsx:52

The developer sees exactly which keys are missing, in which languages, when they were added, and where they are used. The fix is clear: request translations for these keys before merging.

Placeholder Mismatches

Error: Placeholder mismatch in notifications.new_messages (de)
  Source:  "You have {count} new messages from {sender}"
  Target:  "Sie haben {anzahl} neue Nachrichten von {sender}"
  Missing: {count}
  Extra:   {anzahl}

The developer or translator sees the exact mismatch and can fix the German translation to use {count} instead of {anzahl}.

Hardcoded Strings

Warning: Hardcoded string in JSX (src/components/Header.tsx:23)
  <h1>Welcome back!</h1>
  Suggestion: <h1>{t('header.welcome_back')}</h1>

This is a warning, not an error — it will not block the PR by default. But it shows up in the report and contributes to the code analysis score.


Real-World Impact: Before and After

Before: The Manual Process

  1. Developer adds new feature with 30 new keys
  2. Developer adds English translations
  3. Developer opens PR, which is reviewed and merged
  4. Two weeks later, QA tests the feature in French — finds 30 raw keys
  5. QA files a bug report
  6. Developer creates a ticket for translations
  7. Translator provides French translations
  8. Developer commits translation file, opens new PR
  9. Repeat for German, Japanese, Korean, etc.

Time from code merge to fully translated feature: 3-6 weeks.

After: Automated Health Checks

  1. Developer adds new feature with 30 new keys
  2. Developer adds English translations
  3. Developer opens PR
  4. CI runs i18n Doctor — fails with "30 keys missing in fr, de, ja, ko"
  5. Developer requests translations via the platform
  6. Translations arrive (AI-generated in minutes, human-reviewed in hours)
  7. Developer adds translations to PR
  8. CI re-runs — passes
  9. PR merges with all languages complete

Time from code merge to fully translated feature: Same day.

The difference is not just speed — it is about catching the issue in the right place. A CI check catches missing translations in the same PR where the keys were added, when the developer still has full context about the feature. A bug report three weeks later requires the developer to context-switch back to a feature they have already moved on from.


Tracking Health Over Time

A single health score is useful for pass/fail gating. A history of health scores is useful for understanding trends.

When you submit Doctor reports to a platform dashboard, you can track:

  • Score trajectory: Is your i18n health improving, stable, or degrading?
  • Category trends: Maybe your coverage is excellent but orphan keys are accumulating. The category breakdown shows where to focus cleanup efforts.
  • Per-branch comparison: Feature branches often have lower scores (new keys without translations). The main branch should maintain a consistently high score.
  • Cross-project comparison: For organizations with multiple products, compare i18n health across projects to identify which ones need attention.

Setting Team Goals

Health scores make it possible to set measurable i18n goals:

  • "Maintain 90+ health score on main branch" — a quality standard
  • "Reduce orphan keys from 200 to 50 by end of quarter" — a cleanup initiative
  • "Zero placeholder mismatches" — a zero-defect target for the most critical check

Common Objections and Responses

"We only have two languages, we do not need this." Two languages are enough for coverage gaps and placeholder mismatches to cause user-visible bugs. The health check is lightweight — it adds seconds to your CI pipeline, not minutes.

"Our translators handle quality." Translators ensure linguistic quality. Health checks ensure technical quality — placeholder correctness, key coverage, file structure. These are different concerns. A translator cannot know whether a key is referenced in your source code.

"We will add this later when we have more languages." i18n debt compounds. The orphan keys, hardcoded strings, and inconsistent naming you accumulate with two languages become much harder to fix when you add a third, fourth, and fifth language. Starting health checks early is cheaper than retrofitting them.

"Our CI is already slow." A Doctor scan of a 10,000-key project with 8 languages takes under 10 seconds. Use --skip-code to drop AST analysis and cut that to under 3 seconds. The path filter in the GitHub Actions configuration ensures the check only runs on PRs that touch translation-related files.


Getting Started

If you are using Better i18n, the Doctor command is built into the CLI:

# Install the CLI
bun add -g @better-i18n/cli

# Run your first health check
bi18n doctor

# Run in CI mode with reporting
bi18n doctor --ci --report --threshold 80

If you are not using Better i18n, the principles in this post apply to any translation setup. You can build similar checks with custom scripts that:

  1. Parse your source code for translation key references
  2. Compare referenced keys against your translation files
  3. Validate placeholder consistency
  4. Output results in your CI system's annotation format
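For step 4 on GitHub Actions, the annotation format is a plain line on stdout: `::error file={path},line={n}::{message}`. A sketch of an emitter, where the `Issue` shape is a hypothetical structure for illustration:

```typescript
// Emit an issue in GitHub Actions annotation format so it appears
// inline on the PR diff.
interface Issue {
  severity: "error" | "warning";
  file: string;
  line: number;
  message: string;
}

function toAnnotation(issue: Issue): string {
  return `::${issue.severity} file=${issue.file},line=${issue.line}::${issue.message}`;
}

const ann = toAnnotation({
  severity: "error",
  file: "src/pages/Checkout.tsx",
  line: 45,
  message: "checkout.confirm_order missing in: fr, de, ja, ko",
});
// "::error file=src/pages/Checkout.tsx,line=45::checkout.confirm_order missing in: fr, de, ja, ko"
```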

The important thing is not which tool you use. It is that missing translations stop being a surprise found by users and start being a CI check found by developers.


Missing translations are preventable bugs. Start catching them in CI — set up Better i18n Doctor and run your first health check today.