Language Adaptation in Software: Going Beyond Word-for-Word Translation

Eray Gündoğmuş

Key Takeaways

  • Language adaptation addresses grammar, plurals, gender, formality, and cultural context — not just vocabulary
  • ICU MessageFormat provides a standard syntax for handling plurals, gender, and select expressions across languages
  • Different languages have different plural rules: Arabic uses all six CLDR plural categories, while Japanese makes no plural distinction at all
  • Right-to-left (RTL) languages require UI mirroring, not just text direction changes
  • Proper language adaptation can reduce user confusion and help product adoption in target markets

What Is Language Adaptation?

Language adaptation is the process of adjusting software content to fit the linguistic rules, cultural norms, and user expectations of each target locale. It goes beyond translation (replacing words in one language with words in another) to address how languages differ structurally.

For example, English distinguishes "1 item" from "2 items" (two plural forms), while Polish has three forms for whole numbers: "1 element", "2 elementy", "5 elementów". Russian has similar complexity, and Arabic has six distinct plural categories. Handling this correctly requires more than a dictionary lookup.

Plural Rules

Plural handling is one of the most visible language adaptation challenges in software. The Unicode CLDR (Common Locale Data Repository) defines six plural categories: zero, one, two, few, many, and other.

Examples by Language

Language | Categories Used | Example
English | one, other | 1 file, 2 files
French | one, many, other | 1 fichier, 2 fichiers ("one" covers both 0 and 1; recent CLDR releases add "many" for large round numbers such as 1 000 000)
Polish | one, few, many, other | 1 plik, 2 pliki, 5 plików
Arabic | zero, one, two, few, many, other | All 6 forms used
Japanese | other | Only one form (no plurals)
Russian | one, few, many, other | 1 файл, 2 файла, 5 файлов
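
The categories in the table can be checked against the CLDR data bundled with a JavaScript runtime by using Intl.PluralRules. The following is a minimal sketch; pluralCategory is a hypothetical helper name, and the exact results depend on the CLDR version your engine ships with.

```javascript
// Ask the runtime's CLDR data which plural category a number falls into.
function pluralCategory(locale, n) {
  return new Intl.PluralRules(locale).select(n);
}

// Sample numbers chosen to land in different categories.
const samples = [0, 1, 2, 5, 11, 100];

for (const locale of ['en', 'pl', 'ru', 'ar', 'ja']) {
  const categories = samples.map((n) => `${n}: ${pluralCategory(locale, n)}`);
  console.log(locale, categories.join(', '));
}
```

Running this side by side for several locales is a quick way to see which forms a translator must supply for each language.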

ICU MessageFormat for Plurals

{count, plural,
  one {# file selected}
  other {# files selected}
}

For Polish, the same key would use:

{count, plural,
  one {# plik wybrany}
  few {# pliki wybrane}
  many {# plików wybranych}
  other {# pliku wybranego}
}

The i18n library resolves which category to use based on the locale's plural rules defined in CLDR.
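
That resolution step can be sketched in a few lines. This is a simplified illustration, not how any particular library is implemented: formatPlural and the filesSelected catalog are hypothetical names, and the messages are stored as pre-split forms rather than parsed from an ICU string.

```javascript
// Hypothetical message catalog: one set of plural forms per locale.
const filesSelected = {
  en: { one: '# file selected', other: '# files selected' },
  pl: {
    one: '# plik wybrany',
    few: '# pliki wybrane',
    many: '# plików wybranych',
    other: '# pliku wybranego',
  },
};

// Pick the form matching the locale's plural category, then replace '#'
// with the locale-formatted number. Falls back to 'other', the one
// category every plural message is expected to define.
function formatPlural(locale, count, forms) {
  const category = new Intl.PluralRules(locale).select(count);
  const template = forms[category] ?? forms.other;
  const number = new Intl.NumberFormat(locale).format(count);
  return template.replace(/#/g, number);
}

console.log(formatPlural('en', 1, filesSelected.en));
console.log(formatPlural('pl', 5, filesSelected.pl));
```

The application code only passes a locale and a count; which form applies is decided entirely by the locale data.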

Gender and Grammatical Agreement

Many languages have grammatical gender that affects articles, adjectives, and verb forms. In French, "The file is deleted" and "The photo is deleted" need different articles and participle endings because the two nouns differ in gender:

  • Masculine: "Le fichier est supprimé"
  • Feminine: "La photo est supprimée"

ICU MessageFormat handles this with select:

{gender, select,
  male {Le fichier est supprimé}
  female {La photo est supprimée}
  other {L'élément est supprimé}
}

Applications need to pass the gender as a message argument (here, gender) alongside the message so the correct form can be selected. The other branch acts as the fallback when no specific key matches.
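
The selection itself amounts to a keyed lookup with a fallback. The sketch below mirrors the French example above; formatSelect and deletedMessage are hypothetical names, and a real library would parse the ICU string rather than use a plain object.

```javascript
// Hypothetical catalog entry, keyed the same way as the select example.
const deletedMessage = {
  male: 'Le fichier est supprimé',
  female: 'La photo est supprimée',
  other: "L'élément est supprimé",
};

// Return the branch matching the value, or 'other' when nothing matches.
function formatSelect(value, options) {
  // hasOwnProperty guards against inherited keys such as "toString".
  const hasBranch = Object.prototype.hasOwnProperty.call(options, value);
  return hasBranch ? options[value] : options.other;
}

console.log(formatSelect('female', deletedMessage));
console.log(formatSelect('unknown', deletedMessage));
```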

Formality Levels

Languages like Japanese, Korean, German, and Spanish distinguish between formal and informal address. This affects:

  • Pronouns: German "du" (informal) vs "Sie" (formal)
  • Verb conjugations: Spanish "tú tienes" vs "usted tiene"
  • Entire sentence structures: Japanese keigo (honorific language) changes sentence construction

Software targeting these markets needs to decide on a formality level and apply it consistently. B2B software often uses formal address, while consumer apps may choose informal.

Some applications allow users to select their preferred formality level in settings, which requires maintaining parallel translation sets.
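
One possible way to organize parallel translation sets is a catalog per language and register, with a fallback for strings that do not vary by formality. This is a sketch under assumptions: the catalog keys, the translate function, and the choice of the formal set as fallback are all illustrative.

```javascript
// Hypothetical parallel catalogs for German: informal ("du") and formal ("Sie").
const catalogs = {
  'de-informal': { newMessage: 'Du hast eine neue Nachricht' },
  'de-formal': {
    newMessage: 'Sie haben eine neue Nachricht',
    saved: 'Gespeichert',
  },
};

// Look up a key in the user's preferred register, then fall back to the
// formal catalog, and finally to the key itself if no translation exists.
function translate(key, language, formality) {
  const preferred = catalogs[`${language}-${formality}`] ?? {};
  const fallback = catalogs[`${language}-formal`] ?? {};
  return preferred[key] ?? fallback[key] ?? key;
}

console.log(translate('newMessage', 'de', 'informal'));
console.log(translate('saved', 'de', 'informal'));
```

With a fallback in place, only the strings that actually address the user need a second variant, which keeps the extra translation work small.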

Text Expansion and Contraction

When translating from English to other languages, text length changes significantly:

Target Language | Typical Length Change vs. English
German | about 30% longer
Finnish | 30-40% longer
French | 15-20% longer
Chinese | about 30% shorter
Japanese | 20-30% shorter
Korean | 10-20% shorter

This affects UI layout, button sizes, table columns, and navigation menus. Proper language adaptation requires:

  • Flexible layouts that accommodate text expansion
  • Testing with the longest target language (often German or Finnish)
  • Avoiding fixed-width containers for translatable text
  • Using CSS techniques like text-overflow: ellipsis as a last resort, not a default

Right-to-Left (RTL) Languages

Arabic, Hebrew, Persian, and Urdu are written right-to-left. Adapting for RTL involves more than setting dir="rtl" on the HTML element:

  • UI mirroring: Navigation, sidebars, and icon positions should mirror
  • Bidirectional text: Mixed LTR/RTL content (e.g., English brand names in Arabic text) requires proper Unicode Bidirectional Algorithm handling
  • Icons with directionality: Arrow icons, progress indicators, and "back" buttons need to flip
  • Numbers: Arabic-Indic numerals (٠١٢٣) may be expected in some contexts, while Western numerals (0123) are used in others
  • CSS logical properties: Use margin-inline-start instead of margin-left for automatic RTL support

/* Instead of: */
.sidebar { margin-left: 20px; }

/* Use logical properties: */
.sidebar { margin-inline-start: 20px; }
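
On the script side, the direction, bidirectional isolation, and numeral choices from the list above can be sketched as follows. The RTL_LANGUAGES set, directionFor, and isolate are hypothetical names for illustration; the set only covers the four languages named above and is not exhaustive.

```javascript
// Assumption: a hand-maintained set of RTL language codes covering
// Arabic, Hebrew, Persian, and Urdu. Extend it for other RTL languages.
const RTL_LANGUAGES = new Set(['ar', 'he', 'fa', 'ur']);

function directionFor(localeTag) {
  // Intl.Locale parses the tag, so 'ar-EG' and 'ar' both yield 'ar'.
  const { language } = new Intl.Locale(localeTag);
  return RTL_LANGUAGES.has(language) ? 'rtl' : 'ltr';
}

// Wrap embedded text (such as an English brand name) in the Unicode
// isolate characters FSI (U+2068) and PDI (U+2069) so its direction is
// resolved independently of the surrounding text.
function isolate(text) {
  return `\u2068${text}\u2069`;
}

// Numerals: the default for ar-EG is Arabic-Indic digits, and the
// "-u-nu-latn" extension requests Western digits instead.
const arabicDigits = new Intl.NumberFormat('ar-EG').format(123); // "١٢٣"
const westernDigits = new Intl.NumberFormat('ar-EG-u-nu-latn').format(123); // "123"
```

In a browser, the result of directionFor would typically be assigned to document.documentElement.dir so that logical CSS properties resolve in the right direction.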

Date, Time, and Number Formatting

Locale-aware formatting is a critical part of language adaptation:

Format | US English | German | Japanese
Date | 03/02/2026 | 02.03.2026 | 2026年3月2日
Number | 1,234.56 | 1.234,56 | 1,234.56
Currency | $1,234.56 | 1.234,56 € | ¥1,235
Time | 3:30 PM | 15:30 | 15:30

The Intl API in JavaScript handles most formatting automatically:

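// Format a number as euros using German conventions
new Intl.NumberFormat('de-DE', {
  style: 'currency',
  currency: 'EUR'
}).format(1234.56);
// → "1.234,56 €" (the space before € is a non-breaking space, U+00A0)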
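
Dates and times work the same way through Intl.DateTimeFormat. The sketch below reproduces the date and time rows of the table; formatDate and formatTime are hypothetical helper names, and the fixed UTC time zone is an assumption made only to keep the output reproducible.

```javascript
// 2 March 2026, 15:30 UTC. A fixed time zone keeps the output reproducible.
const date = new Date(Date.UTC(2026, 2, 2, 15, 30));

function formatDate(locale) {
  return new Intl.DateTimeFormat(locale, {
    day: '2-digit', month: '2-digit', year: 'numeric', timeZone: 'UTC',
  }).format(date);
}

function formatTime(locale) {
  return new Intl.DateTimeFormat(locale, {
    hour: 'numeric', minute: '2-digit', timeZone: 'UTC',
  }).format(date);
}

console.log(formatDate('en-US')); // "03/02/2026"
console.log(formatDate('de-DE')); // "02.03.2026"

// The long date style produces the Japanese year-month-day form.
const japaneseLong = new Intl.DateTimeFormat('ja-JP', {
  dateStyle: 'long', timeZone: 'UTC',
}).format(date);
console.log(japaneseLong); // "2026年3月2日"

// Times follow each locale's 12-hour or 24-hour convention.
console.log(formatTime('en-US'), formatTime('de-DE'));
```

Some runtimes separate the time from "PM" with a narrow no-break space rather than a plain space, so avoid comparing formatted output against hardcoded strings without normalizing whitespace first.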

Practical Implementation Tips

  1. Use ICU MessageFormat: It handles plurals, gender, and select expressions in a standard way across languages
  2. Never concatenate translated strings: "Welcome, " + name + "!" breaks in languages where word order differs
  3. Externalize all user-facing text: Even error messages and validation text need translation
  4. Test with pseudo-localization: Use tools that expand text and add accented characters to catch layout issues early
  5. Provide context for translators: A string like "Open" could be a verb ("Open the file") or adjective ("The file is open") — context determines the correct translation
  6. Use CSS logical properties: They automatically handle RTL layout without separate stylesheets
  7. Validate with native speakers: Automated tools catch formatting issues, but cultural appropriateness requires human review
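
Pseudo-localization (tip 4) can be approximated with a small function. This is a rough sketch, not a substitute for a dedicated tool: pseudoLocalize and the ACCENTED map are hypothetical, the default 30% padding is an assumption based on the expansion figures above, and anything inside braces is left untouched, including translatable text nested in plural or select branches.

```javascript
// Assumption: a small hand-picked character map; real tools cover more letters.
const ACCENTED = {
  a: 'á', e: 'é', i: 'í', o: 'ó', u: 'ú',
  A: 'Á', E: 'É', I: 'Í', O: 'Ó', U: 'Ú',
};

function pseudoLocalize(message, expansion = 0.3) {
  let depth = 0;
  let out = '';
  for (const ch of message) {
    if (ch === '{') depth += 1;
    // Leave placeholders such as {name} untouched so formatting still works.
    out += depth === 0 && ACCENTED[ch] ? ACCENTED[ch] : ch;
    if (ch === '}') depth -= 1;
  }
  // Pad the string to simulate languages that run longer than English.
  const padding = '~'.repeat(Math.ceil(message.length * expansion));
  // Brackets make truncated strings easy to spot in the UI.
  return `[${out}${padding}]`;
}

console.log(pseudoLocalize('Welcome, {name}!')); // "[Wélcómé, {name}!~~~~~]"
```

If a closing bracket is missing on screen, the container is clipping text. If a string shows up without accents, it was probably hardcoded rather than externalized.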

FAQ

What is the difference between translation and language adaptation?

Translation converts text from one language to another. Language adaptation goes further by adjusting grammar (plurals, gender), formatting (dates, numbers, currencies), UI layout (RTL support, text expansion), and cultural context (formality levels, color meanings, idioms) to create a natural experience for each locale.

How does ICU MessageFormat help with language adaptation?

ICU MessageFormat is a standard syntax supported, natively or through plugins, by many i18n libraries (react-intl, vue-i18n, better-i18n). It lets translators handle plurals, gender selection, and conditional text within a single translation string. Instead of developers writing if/else logic for each language's rules, the translator writes the appropriate MessageFormat pattern, and the library resolves it at runtime.

Do I need separate codebases for RTL languages?

No. Modern CSS logical properties (margin-inline-start, padding-block-end, etc.) and the dir="rtl" HTML attribute handle most RTL layout automatically. Directional icons and mixed-direction text still need individual attention, but not a separate codebase. The key is to avoid hardcoded directional values like margin-left in favor of logical equivalents from the start of your project.