i18n Health Check: Automated Translation Quality Monitoring

Your application ships in twelve languages. But how do you know that every screen, every error message, every tooltip has been translated? How do you know that a placeholder like {count} in English was not accidentally written as {nombre} in French? How do you know that the keys you deleted from the codebase three sprints ago are not still sitting in your translation files, adding clutter and confusion?

Most teams find out about translation problems from users. A customer in Tokyo sees a raw key like dashboard.welcome_message instead of a greeting. A German user reports that a price displays {amount} instead of the actual number. A QA engineer manually compares JSON files and discovers that the Spanish translation is missing 47 keys that were added last month.

better-i18n Doctor is an automated i18n health check that scans your codebase and translation files, identifies every category of translation issue, and produces a 0-100 health score. It runs locally via the CLI, integrates into your CI/CD pipeline, and reports results to the better-i18n platform for tracking over time.

How the Health Score Works

Doctor produces a single number: a health score from 0 to 100. This score is a weighted aggregate of four categories, each evaluating a different dimension of your i18n quality.

The Four Categories

Coverage measures whether every key used in your code exists in every target language file. A key present in English but missing in Japanese is a coverage gap. Coverage is the most common source of translation problems — new features ship with keys that were never sent for translation, or a developer adds a key to one namespace and forgets to add it to the others.

Quality checks the content of translations for structural correctness. Placeholder mismatches are the primary concern — if the English string has {count} and {name}, the German translation must have exactly the same placeholders. Quality also checks for empty translations (keys that exist but have blank values), excessively long translations that may break UI layouts, and strings that are identical to the source language (which may indicate untranslated content that was copied from English).

Structure evaluates the organization of your translation files. It checks for orphan keys — keys that exist in translation files but are never referenced in your source code. Orphan keys are harmless but create maintenance burden: translators spend time updating strings that no user will ever see, and developers waste time reviewing translations for unused features. Structure also checks for consistent key naming, duplicate keys, and namespace organization.

Code uses AST-level analysis to scan your source code for hardcoded strings that should be internationalized. It detects user-facing strings in JSX components, template literals passed to UI functions, and string constants used in error messages or notifications. This category catches the most common source of i18n debt: a developer writes <p>Loading...</p> instead of <p>{t('common.loading')}</p> because it is faster, intending to fix it later. Doctor finds these strings before they ship.

Score Calculation

Each category produces a sub-score from 0 to 100 based on the ratio of passed checks to total checks. The overall health score is a weighted average:

Category    Weight   What It Measures
─────────────────────────────────────────────────────────────────────
Coverage    40%      Missing translation keys across languages
Quality     30%      Placeholder mismatches, empty values, suspicious content
Structure   20%      Orphan keys, naming consistency, duplicates
Code        10%      Hardcoded strings in source code

Coverage is weighted highest because missing translations have the most direct user impact — they result in raw keys or fallback language being shown to users. Code analysis is weighted lowest because hardcoded strings are technical debt that does not immediately break the user experience, though they should be addressed over time.
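The aggregation itself is plain arithmetic. A minimal sketch — the weights come from the table above, and the sub-scores are the ones from the sample report later in this page:

```typescript
// Category weights from the table above.
const WEIGHTS: Record<string, number> = {
  coverage: 0.4,
  quality: 0.3,
  structure: 0.2,
  code: 0.1,
};

// Illustrative sub-scores (each is passed checks / total checks × 100).
const subScores: Record<string, number> = {
  coverage: 92,
  quality: 85,
  structure: 78,
  code: 90,
};

// Overall score is the weight-adjusted sum, rounded to an integer.
const overall = Math.round(
  Object.entries(WEIGHTS).reduce((sum, [cat, w]) => sum + w * subScores[cat], 0),
);

console.log(overall); // → 87
```

With these sub-scores the weighted sum is 36.8 + 25.5 + 15.6 + 9 = 86.9, which rounds to 87 — the overall score shown in the sample report below.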

Pass/Fail Threshold

A scan is marked as passed when the overall score is 80 or above and there are zero errors (as opposed to warnings). Errors are issues that directly affect users — missing translations for complete features, placeholder mismatches that will cause runtime errors, or keys that reference nonexistent namespaces. Warnings are issues that should be fixed but do not break the user experience — orphan keys, inconsistent naming, or hardcoded strings.
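In other words, the gate is two conditions joined by AND — a high score cannot compensate for the presence of errors. A one-function sketch of the logic described above:

```typescript
// A scan passes only when BOTH conditions hold: the overall score meets
// the threshold AND no error-severity issues exist. Warnings never fail a scan.
function scanPassed(score: number, errorCount: number, threshold = 80): boolean {
  return score >= threshold && errorCount === 0;
}

scanPassed(87, 0); // true  — matches the sample report below
scanPassed(87, 2); // false — errors always fail the gate, regardless of score
scanPassed(79, 0); // false — below the default threshold of 80
```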

You can configure the pass threshold to match your team's standards:

bi18n doctor --threshold 90

Running Doctor Locally

Basic Scan

Run a full health check from your project root:

bi18n doctor

Doctor automatically discovers your translation files based on your better-i18n.yml configuration or by detecting common directory structures (locales/, messages/, i18n/, lib/l10n/). It scans your source files for key usage and cross-references everything to produce the health report.
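The exact schema of better-i18n.yml is defined by the CLI; the fragment below is only a hypothetical illustration of the kind of settings discovery relies on — every key name here is an assumption, not documented syntax:

```yaml
# Hypothetical example — key names are illustrative, not authoritative.
source_language: en
target_languages: [fr, de, ja]
translation_dir: locales/
source_globs:
  - "src/**/*.{ts,tsx}"
```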

Output is a structured report showing the overall score, category breakdowns, and individual rule results:

i18n Doctor Report
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Overall Score: 87/100 ✓ PASSED

Coverage     92/100  ██████████████████░░  3 missing keys
Quality      85/100  █████████████████░░░  2 placeholder mismatches
Structure    78/100  ███████████████░░░░░  12 orphan keys
Code         90/100  ██████████████████░░  4 hardcoded strings

Errors: 0  Warnings: 21
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Targeted Scans

Skip categories that are not relevant to your immediate task or are too slow for quick iteration:

# Skip code analysis (faster scan)
bi18n doctor --skip-code

# Skip health/quality checks (only coverage and structure)
bi18n doctor --skip-health

# Skip sync status checks
bi18n doctor --skip-sync

# Verbose output — show every rule result, not just failures
bi18n doctor --verbose

Individual Check Commands

Doctor bundles several checks that can also be run independently:

# Check for missing translation keys across all languages
bi18n check:missing

# Check for orphan keys not referenced in source code
bi18n check:unused

# Run all checks (equivalent to doctor without code analysis)
bi18n check

# Scan source code for hardcoded strings
bi18n scan

# Sync status — compare local files with platform state
bi18n sync --dry-run

Each command produces focused output for its specific concern, which is useful when you are working on fixing a particular category of issues.

CI/CD Integration

GitHub Actions

Doctor is designed to run as a CI check on every pull request. The --ci flag outputs results in a format that GitHub Actions understands, producing inline annotations on the files with issues:

# .github/workflows/i18n-doctor.yml
name: i18n Health Check

on:
  pull_request:
    paths:
      - "locales/**"
      - "src/**"
      - "messages/**"

jobs:
  doctor:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Bun
        uses: oven-sh/setup-bun@v2

      - name: Install dependencies
        run: bun install

      - name: Run i18n Doctor
        run: bunx @better-i18n/cli doctor --ci --threshold 80
        env:
          BETTER_I18N_API_KEY: ${{ secrets.BETTER_I18N_API_KEY }}

When the --ci flag is set, Doctor:

  • Exits with code 1 if the scan fails (score below threshold or errors present), causing the GitHub Actions check to fail
  • Outputs annotations in GitHub Actions format, so issues appear as inline comments on the PR diff
  • Produces a summary that appears in the GitHub Actions check output
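The inline annotations rely on GitHub's workflow-command syntax: lines written to stdout in the form ::error file=…,line=…::message (or ::warning …) become annotations on the PR diff. The messages below are illustrative, not verbatim Doctor output:

```
::error file=locales/fr.json,line=12::Missing key "checkout.confirm_order" (fr)
::warning file=src/pages/Settings.tsx,line=15::Hardcoded string in JSX element
```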

Auto-Generated GitHub Actions Workflow

If you do not want to write the workflow file manually, better-i18n can create it for you. In the platform dashboard, navigate to Integrations then GitHub Actions and click Create Doctor Workflow. This creates a pull request on your repository with a pre-configured workflow file tailored to your project's settings.

The auto-generated workflow includes:

  • Path filters matching your translation file locations
  • Your configured threshold
  • API key setup instructions
  • Optional Slack notification on failure

Reporting to the Platform

When Doctor runs with the --report flag and an API key, it submits the full report to the better-i18n platform:

bi18n doctor --report --api-key $BETTER_I18N_API_KEY

The report includes:

  • Score and pass/fail status
  • Error and warning counts per category
  • Individual rule results with affected keys and files
  • Metadata: commit SHA, branch name, number of scanned files, number of checked keys, timestamp
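As a rough illustration, a submitted report might serialize along these lines — the field names are hypothetical, and the bullet list above is the authoritative description of the contents:

```json
{
  "score": 87,
  "passed": true,
  "errors": 0,
  "warnings": 21,
  "categories": {
    "coverage": { "score": 92, "failures": 3 },
    "quality": { "score": 85, "failures": 2 }
  },
  "meta": { "branch": "main", "commit": "abc1234", "scannedFiles": 200 }
}
```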

Reports submitted to the platform are stored and displayed in the project dashboard. You can view score trends over time, compare reports between branches, and track which categories are improving or degrading.

CI Report Submission

In CI environments, combine --ci and --report to both validate the PR and submit the report:

- name: Run i18n Doctor
  run: bunx @better-i18n/cli doctor --ci --report --threshold 85
  env:
    BETTER_I18N_API_KEY: ${{ secrets.BETTER_I18N_API_KEY }}

This gives you two feedback loops:

  1. Immediate: The PR check passes or fails, and developers see inline annotations
  2. Historical: The report is stored on the platform for trend analysis and team visibility

Rule Details

Doctor evaluates a set of rules within each category. Here are the most impactful rules and what they detect.

Coverage Rules

missing-keys: For every key used in your source code, checks whether a translation exists in every target language file. Missing keys are the most common i18n issue and the most user-visible — they result in raw key names or fallback language being displayed.
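Conceptually, this rule is a set difference per target language. A simplified sketch — the real implementation also resolves namespaces and file layout:

```typescript
// Keys referenced in source code, e.g. collected from t('...') calls.
const usedKeys = new Set(["checkout.confirm_order", "checkout.payment_method"]);

// Keys present per target language file (flattened dot-paths).
const languages: Record<string, Set<string>> = {
  fr: new Set(["checkout.confirm_order"]),
  de: new Set(["checkout.confirm_order", "checkout.payment_method"]),
};

// A coverage gap is any used key absent from a target language.
const missing: { key: string; lang: string }[] = [];
for (const key of usedKeys) {
  for (const [lang, keys] of Object.entries(languages)) {
    if (!keys.has(key)) missing.push({ key, lang });
  }
}

console.log(missing); // [{ key: "checkout.payment_method", lang: "fr" }]
```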

namespace-coverage: Checks that every namespace referenced in your code has corresponding translation files for all target languages. A developer might add t('checkout.confirm_order') but the checkout namespace file does not exist for Korean.

Quality Rules

placeholder-mismatch: Compares placeholders between source and target translations. If en: "Hello {name}, you have {count} items" exists, checks that every other language's translation contains exactly {name} and {count}. Extra or missing placeholders cause runtime errors or display raw placeholder syntax.
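With single-brace placeholders like those above, the comparison can be sketched as follows — a simplification, since real ICU MessageFormat parsing must also handle nesting, plurals, and select forms:

```typescript
// Extract {name}-style placeholders from a translation string.
const placeholders = (s: string): Set<string> =>
  new Set([...s.matchAll(/\{(\w+)\}/g)].map((m) => m[1]));

// Report placeholders present in one string but not the other.
function placeholderDiff(source: string, target: string) {
  const src = placeholders(source);
  const tgt = placeholders(target);
  return {
    missing: [...src].filter((p) => !tgt.has(p)), // in source, absent in target
    extra: [...tgt].filter((p) => !src.has(p)),   // in target, absent in source
  };
}

placeholderDiff(
  "You have {count} new messages from {sender}",
  "Du hast {anzahl} neue Nachrichten von {sender}",
);
// → { missing: ["count"], extra: ["anzahl"] }
```

This is the exact mismatch shown in Example 2 further down: the German translation introduces {anzahl} and drops {count}.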

empty-translation: Flags keys that exist in a target language but have empty string values. Empty translations are often the result of adding keys programmatically without providing actual translated content.

source-identical: Flags translations that are character-for-character identical to the source language. While some strings (brand names, URLs, technical terms) are legitimately identical across languages, a high number of source-identical strings usually indicates untranslated content.

Structure Rules

orphan-keys: Identifies keys in translation files that are not referenced anywhere in the source code. Orphan keys accumulate when features are removed but translation files are not cleaned up. They waste translator effort and create confusion about what is actively used.
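The orphan check is the mirror image of missing-keys: instead of used keys absent from files, it looks for file keys absent from usage. A compact sketch:

```typescript
// Keys defined in a translation file vs. keys referenced in source code.
const fileKeys = new Set([
  "home.title",
  "legacy_dashboard.welcome",
  "legacy_dashboard.stats",
]);
const usedKeys = new Set(["home.title"]);

// An orphan is a defined key that no source file references.
const orphans = [...fileKeys].filter((key) => !usedKeys.has(key));
console.log(orphans); // ["legacy_dashboard.welcome", "legacy_dashboard.stats"]
```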

duplicate-keys: Detects the same key defined multiple times within a single file or namespace. Duplicates cause unpredictable behavior — the translation engine uses one of them, but which one depends on implementation details.

naming-consistency: Checks that key names follow consistent patterns. If most keys use snake_case, a key using camelCase is flagged. Inconsistent naming makes keys harder to find and maintain.
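One way to implement this — a sketch of the idea, not Doctor's actual heuristic — is to detect the dominant case style and flag deviations from it:

```typescript
// Does a key segment follow snake_case?
const isSnake = (s: string): boolean => /^[a-z0-9]+(_[a-z0-9]+)*$/.test(s);

const keys = ["confirm_order", "payment_method", "shipping_address", "billingAddress"];

// If a majority of keys are snake_case, flag the ones that are not.
const snakeCount = keys.filter(isSnake).length;
const flagged = snakeCount > keys.length / 2 ? keys.filter((k) => !isSnake(k)) : [];

console.log(flagged); // ["billingAddress"]
```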

Code Rules

hardcoded-jsx: Uses AST parsing to detect string literals inside JSX elements. <h1>Welcome</h1> is flagged; <h1>{t('welcome')}</h1> is not. This rule understands JSX and ignores non-user-facing strings like CSS class names and data attributes.
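Doctor's rule works at the AST level; as a deliberately crude regex-based approximation of the same idea (a real implementation parses the JSX tree and is far more precise about attributes, expressions, and non-user-facing strings):

```typescript
// Crude approximation: find text between JSX tags that contains letters and
// is not a {…} expression. This sketch only illustrates the concept — it is
// NOT how Doctor's AST-based analysis actually works.
function findJsxText(source: string): string[] {
  return [...source.matchAll(/>([^<>{}]*[A-Za-z][^<>{}]*)</g)]
    .map((m) => m[1].trim())
    .filter(Boolean);
}

findJsxText(`<h1>Welcome</h1><p>{t('common.loading')}</p>`);
// → ["Welcome"]
```

Note how the translated element is skipped: its child is a `{t(…)}` expression rather than raw text, which is exactly the distinction the rule enforces.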

hardcoded-template: Detects string literals passed to functions that typically produce user-facing output — toast notifications, alert dialogs, error messages. showToast("Operation successful") is flagged.

hardcoded-constant: Identifies string constants assigned to variables with user-facing names (like errorMessage, label, title, placeholder) that are not wrapped in translation functions.

Platform Dashboard

Reports submitted via --report are visualized in the better-i18n platform dashboard.

Score Trends

A time-series chart shows your health score over time. Each point represents a Doctor report, plotted by date. You can filter by branch to see the health trajectory of main versus feature branches. The trend line makes it easy to see whether your i18n quality is improving, stable, or degrading.

Category Breakdown

Drill into each category to see which rules are passing and which are failing. For each failing rule, you can see the specific keys and files involved. Click on a key to open it in the translation editor; click on a file to see it in the context of your repository.

Cross-Project Comparison

For organizations with multiple projects, the dashboard shows health scores across all projects. This is useful for identifying which projects need i18n attention and for setting organization-wide quality standards.

Alerts

Configure alerts to be notified when your health score drops below a threshold:

  • Email: Weekly digest of health score changes
  • Slack: Instant notification when a report fails
  • Webhook: Custom integration for your monitoring stack

Practical Examples

Example 1: Catching Missing Translations Before Release

Your team added a new checkout flow with 30 new translation keys. The developer added all keys to the English file. French and German translations were requested but not yet completed. Without Doctor, this ships with raw keys visible to French and German users.

With Doctor in CI:

Coverage     60/100  ████████████░░░░░░░░  30 missing keys (fr, de)

Error: 30 keys missing in target languages
  checkout.confirm_order — missing in: fr, de
  checkout.payment_method — missing in: fr, de
  checkout.shipping_address — missing in: fr, de
  ... (27 more)

Result: FAILED (score 60, threshold 80)

The PR is blocked. The developer sees the inline annotations, requests translations, and the PR merges only after translations are complete.

Example 2: Detecting Placeholder Mismatches

A translator updates the German translation for a notification string. The English source has You have {count} new messages from {sender}. The German translation accidentally uses {anzahl} instead of {count}.

Doctor catches this:

Quality      75/100  ███████████████░░░░░  1 placeholder mismatch

Error: Placeholder mismatch in notifications.new_messages (de)
  Source placeholders: {count}, {sender}
  Target placeholders: {anzahl}, {sender}
  Missing: {count}
  Extra: {anzahl}

Example 3: Cleaning Up After a Feature Removal

Your team removed the legacy dashboard six months ago. The code is gone, but the translation files still contain 85 keys under the legacy_dashboard namespace. Translators occasionally update these strings when doing bulk translation passes, wasting effort on content no one sees.

Doctor finds the orphan keys:

Structure    65/100  █████████████░░░░░░░  85 orphan keys

Warning: 85 keys in namespace "legacy_dashboard" are not referenced in source code
  legacy_dashboard.welcome — not referenced
  legacy_dashboard.stats_header — not referenced
  legacy_dashboard.chart_title — not referenced
  ... (82 more)

Example 4: Finding Hardcoded Strings

A new developer joins the team and writes a feature without using the translation system. They hardcode all strings directly in JSX:

// Before Doctor
<div>
  <h2>Account Settings</h2>
  <p>Manage your account preferences below.</p>
  <button>Save Changes</button>
</div>

Doctor flags every hardcoded string:

Code         40/100  ████████░░░░░░░░░░░░  3 hardcoded strings

Warning: Hardcoded string in JSX element (src/pages/Settings.tsx:15)
  <h2>Account Settings</h2>
  Suggestion: <h2>{t('settings.account_title')}</h2>

Warning: Hardcoded string in JSX element (src/pages/Settings.tsx:16)
  <p>Manage your account preferences below.</p>

Warning: Hardcoded string in JSX element (src/pages/Settings.tsx:17)
  <button>Save Changes</button>

Comparison with Alternatives

Manual JSON Diffing: Teams that compare translation files manually catch coverage issues but miss everything else — placeholder mismatches, orphan keys, hardcoded strings. Manual checks are also error-prone and do not scale beyond a handful of languages.

ESLint i18n Plugins: Linting rules like eslint-plugin-i18next catch hardcoded strings in JSX but do not check translation file quality, coverage across languages, or structural issues. Doctor includes code analysis as one of four categories and covers the full spectrum of i18n issues.

Phrase QPS (Quality Performance Score): Phrase provides a translation quality score, but it focuses on linguistic quality (grammar, terminology) rather than technical quality (missing keys, placeholder mismatches, orphan keys). Doctor focuses on the technical dimension — the issues that cause runtime errors and broken UIs.

No Automated Checks: Many teams have no automated i18n checks at all. Issues are discovered by users or QA engineers. Doctor provides comprehensive automated coverage that catches issues before they reach any environment.

Frequently Asked Questions

How long does a Doctor scan take? A typical scan of a project with 10,000 keys, 8 languages, and 200 source files completes in under 10 seconds. Code analysis (AST parsing) is the slowest category — use --skip-code for faster scans when you only need coverage and quality checks.

Can I run Doctor without connecting to the better-i18n platform? Yes. Doctor runs entirely locally by default. The --report flag is optional and only needed if you want to submit results to the platform for trend tracking.

Which frameworks does code analysis support? Code analysis currently supports React (JSX/TSX), Vue (SFC templates), and Svelte components. Angular support is planned. Framework detection is automatic based on your project's dependencies.

Can I add custom rules? Custom rules are on the roadmap. Currently, you can configure rule severity (error vs. warning) and disable specific rules that are not relevant to your project.

Does Doctor work with monorepos? Yes. Doctor supports workspace-aware scanning. It detects workspace boundaries and scans each package independently, producing a per-package report and an aggregated overall score.

How does the GitHub Actions workflow creation work? In the better-i18n dashboard, the Create Doctor Workflow action uses the GitHub API to create a pull request on your repository with a pre-configured .github/workflows/i18n-doctor.yml file. You review and merge the PR to activate the workflow.

Start Monitoring Your i18n Health

Translation problems should be caught in CI, not by users. better-i18n Doctor gives your team a continuous, automated health check that scores every dimension of your i18n quality and blocks broken translations from shipping.

Start your free trial and run your first Doctor scan in under five minutes.