Table of Contents
- ICU Message Format: Syntax, Patterns, and Implementation Guide
- Why ICU MessageFormat Exists
- Basic Syntax
- Plural Patterns
- CLDR Plural Categories
- Exact Value Matching
- Select Patterns
- Nested Patterns
- selectordinal Patterns
- Implementation by Language
- JavaScript / TypeScript
- Java
- PHP
- Best Practices
- Keep Messages Translatable
- Required Fallbacks
- Formatting Numbers and Dates
- Frequently Asked Questions
- What is the difference between ICU MessageFormat and gettext?
- Do I need ICU MessageFormat for simple apps?
- How do translators work with ICU messages?
- Is ICU MessageFormat the same as MessageFormat 2.0?
ICU Message Format: Syntax, Patterns, and Implementation Guide
ICU MessageFormat is a standard syntax for handling complex translatable strings — including plurals, gender-based variations, and conditional text. Originally developed as part of the International Components for Unicode (ICU) project, the format is now used across programming languages and platforms as the go-to solution for messages that change based on runtime values.
If you've ever needed to display "1 item" vs. "2 items" or "She liked your photo" vs. "They liked your photo" in multiple languages, ICU MessageFormat is designed for exactly this.
Why ICU MessageFormat Exists
Simple key-value translation formats work well for static strings, but they break down when a message depends on variables:
// Naive approach — breaks in many languages
"You have " + count + " new messages"
This concatenation approach fails because:
- Word order varies by language. Japanese places the verb at the end of the sentence, so the number and noun come before it. Arabic may restructure the entire sentence.
- Plural rules differ. In CLDR terms, English has 2 plural categories (one, other). Russian and Polish each have 4 (one, few, many, other), with rules that depend on the last digits of the number. Arabic has 6.
- Gender affects surrounding words. In French, "nouveau" (new) becomes "nouvelle" for feminine nouns, and "nouveaux" for masculine plural.
ICU MessageFormat solves these problems with a declarative syntax that lets translators handle all variations in a single string.
Basic Syntax
An ICU message is a string containing plain text mixed with argument placeholders enclosed in curly braces:
Hello, {name}!
When processed, {name} is replaced with the provided value. The power comes from format types that add conditional logic:
{count, plural, one {# message} other {# messages}}
This tells the formatter: "Look at the count argument. If the plural category is one, output # message. Otherwise, output # messages." The # symbol is replaced with the formatted number.
Plural Patterns
Pluralization is ICU MessageFormat's most commonly used feature. The syntax is:
{variable, plural,
=0 {No messages}
one {# message}
other {# messages}
}
CLDR Plural Categories
The Unicode CLDR (Common Locale Data Repository) defines six plural categories:
| Category | Used by | Typical quantities (vary by language) |
|---|---|---|
| zero | Arabic, Latvian, Welsh | 0 items |
| one | English, French, German | 1 item |
| two | Arabic, Hebrew, Slovenian | 2 items |
| few | Russian, Polish, Czech | 2-4 items |
| many | Russian, Polish, Arabic | 5-20 items |
| other | All languages (required) | Any quantity not covered above |
Not every language uses every category. English only needs one and other. The other category is always required as a fallback.
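To see how the same number lands in different categories, the sketch below asks the runtime's built-in Intl.PluralRules for the cardinal category of a few sample values. It assumes a JavaScript runtime with full locale data (recent Node.js versions and modern browsers); the helper name categoriesFor and the sample list are made up for illustration.

```javascript
// Cardinal plural categories as reported by the runtime's CLDR data.
// `samples` and `categoriesFor` are illustrative names, not part of any library.
const samples = [0, 1, 2, 3, 5, 11, 21, 100];

function categoriesFor(locale) {
  const rules = new Intl.PluralRules(locale); // type defaults to "cardinal"
  return samples.map((n) => `${n}:${rules.select(n)}`).join(' ');
}

for (const locale of ['en', 'ru', 'pl', 'ar']) {
  console.log(locale.padEnd(3), categoriesFor(locale));
}
```

In English every sample except 1 resolves to other, while Russian sends 21 back to one and Arabic uses zero and two for 0 and 2. This is why a pattern written with only one and other is not enough for every translation.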
Exact Value Matching
Use =N to match exact numbers, which takes precedence over category matching:
{count, plural,
=0 {Your inbox is empty}
=1 {You have one new message}
=42 {You have the answer to everything}
other {You have # new messages}
}
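The precedence rule can be sketched in a few lines. This is not how any particular library is implemented; pickPluralBranch is a hypothetical helper that checks an exact =N key first, then the locale's plural category, then other, and replaces # with the locale-formatted number. The one branch is added here only to show that =1 wins over it.

```javascript
// Hypothetical helper illustrating branch selection for a plural argument:
// exact "=N" keys first, then the CLDR plural category, then "other".
function pickPluralBranch(count, branches, locale = 'en') {
  const exactKey = `=${count}`;
  const category = new Intl.PluralRules(locale).select(count);
  const template = branches[exactKey] ?? branches[category] ?? branches.other;
  if (template === undefined) {
    throw new Error('A plural pattern must include an "other" branch');
  }
  const formatted = new Intl.NumberFormat(locale).format(count);
  // A replacer function avoids special handling of "$" in the replacement text.
  return template.replace(/#/g, () => formatted);
}

const inbox = {
  '=0': 'Your inbox is empty',
  '=1': 'You have one new message',
  '=42': 'You have the answer to everything',
  one: 'You have # new message', // never chosen for 1 in English: "=1" matches first
  other: 'You have # new messages',
};

console.log(pickPluralBranch(0, inbox));    // "Your inbox is empty"
console.log(pickPluralBranch(1, inbox));    // "You have one new message"
console.log(pickPluralBranch(42, inbox));   // "You have the answer to everything"
console.log(pickPluralBranch(1000, inbox)); // "You have 1,000 new messages"
```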
Select Patterns
The select type chooses output based on a string value, commonly used for gender:
{gender, select,
female {She liked your post}
male {He liked your post}
other {They liked your post}
}
The other case is always required and serves as the fallback.
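Conceptually, select is a keyed lookup with a mandatory default. The sketch below shows that behavior with a hypothetical pickSelectBranch helper; real formatters do this while parsing the message, so treat it as an illustration of the rule rather than a library API.

```javascript
// Hypothetical helper: exact string match on the key, otherwise fall back to "other".
function pickSelectBranch(value, branches) {
  if (!Object.prototype.hasOwnProperty.call(branches, 'other')) {
    throw new Error('A select pattern must include an "other" case');
  }
  return Object.prototype.hasOwnProperty.call(branches, value)
    ? branches[value]
    : branches.other;
}

const liked = {
  female: 'She liked your post',
  male: 'He liked your post',
  other: 'They liked your post',
};

console.log(pickSelectBranch('female', liked));    // "She liked your post"
console.log(pickSelectBranch('nonbinary', liked)); // "They liked your post"
console.log(pickSelectBranch(undefined, liked));   // "They liked your post"
```

Because unknown or missing values fall through to other, that branch should read naturally for any value rather than assume a specific one.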
Nested Patterns
ICU MessageFormat patterns can be nested for complex scenarios. For example, combining gender and plural:
{gender, select,
female {{count, plural,
one {She added # photo}
other {She added # photos}
}}
male {{count, plural,
one {He added # photo}
other {He added # photos}
}}
other {{count, plural,
one {They added # photo}
other {They added # photos}
}}
}
While powerful, deeply nested patterns are harder for translators to work with. Keep nesting to two levels when possible.
selectordinal Patterns
The selectordinal type handles ordinal numbers (1st, 2nd, 3rd, etc.):
{position, selectordinal,
one {#st place}
two {#nd place}
few {#rd place}
other {#th place}
}
This correctly produces "1st place", "2nd place", "3rd place", "4th place", and so on in English.
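The categories one, two, few, and other here are ordinal categories, which differ from the cardinal ones used by plural. A minimal sketch using the built-in Intl.PluralRules with type "ordinal" shows how English numbers map to suffixes; the suffixes table and the ordinal function are illustrative names.

```javascript
// English ordinal categories mapped to suffixes, mirroring the pattern above.
const ordinalRules = new Intl.PluralRules('en', { type: 'ordinal' });
const suffixes = { one: 'st', two: 'nd', few: 'rd', other: 'th' };

function ordinal(n) {
  return `${n}${suffixes[ordinalRules.select(n)]}`;
}

console.log([1, 2, 3, 4, 11, 12, 13, 21, 22, 23, 111].map(ordinal).join(', '));
// "1st, 2nd, 3rd, 4th, 11th, 12th, 13th, 21st, 22nd, 23rd, 111th"
```

Note that 11, 12, and 13 resolve to other, which is why a hand-written "last digit" check tends to get them wrong.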
Implementation by Language
ICU MessageFormat is implemented in most programming ecosystems:
JavaScript / TypeScript
The intl-messageformat package (maintained by FormatJS) is a widely used implementation:
import { IntlMessageFormat } from 'intl-messageformat';
const message = new IntlMessageFormat(
'{count, plural, one {# item} other {# items}}',
'en'
);
message.format({ count: 1 }); // "1 item"
message.format({ count: 5 }); // "5 items"
React applications commonly use react-intl (also part of FormatJS), which wraps intl-messageformat with React components:
<FormattedMessage
id="cart.itemCount"
defaultMessage="{count, plural, one {# item} other {# items}}"
values={{ count: 3 }}
/>
Java
The JDK's java.text.MessageFormat does not understand the plural, select, or selectordinal argument types; it only offers the older choice format. For ICU patterns, use ICU4J's com.ibm.icu.text.MessageFormat, which exposes a similar API:
import com.ibm.icu.text.MessageFormat;
String pattern = "{0, plural, one {# message} other {# messages}}";
String result = MessageFormat.format(pattern, 5);
// "5 messages"
ICU4J is a separate dependency (the com.ibm.icu:icu4j artifact) and also supports named arguments such as {count}.
PHP
The intl extension provides MessageFormatter:
$formatter = new MessageFormatter('en', '{count, plural, one {# item} other {# items}}');
echo $formatter->format(['count' => 1]); // "1 item"
Best Practices
Keep Messages Translatable
- Avoid concatenating ICU messages — give translators the full sentence
- Provide context comments explaining when each variant appears
- Test with languages that have complex plural rules (Arabic, Polish) to verify your patterns are complete
Required Fallbacks
- Always include the other category in plural and selectordinal
- Always include the other case in select
- These are required by the specification and serve as fallbacks for unexpected values
Formatting Numbers and Dates
ICU MessageFormat supports number and date formatting within messages:
{price, number, currency}
{date, date, medium}
The exact format depends on the locale, following CLDR conventions.
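To illustrate how much the output varies by locale, the sketch below uses the built-in Intl.NumberFormat and Intl.DateTimeFormat as stand-ins for the number and date argument types. It is an approximation, not the ICU formatter itself: the currency codes are passed explicitly here as an assumption, and the sample price and date are made up.

```javascript
// Sketch only: Intl.NumberFormat and Intl.DateTimeFormat stand in for the
// number and date argument types. The explicit currency codes are assumptions.
const price = 1234.5;
const when = new Date(Date.UTC(2024, 0, 15)); // 15 January 2024, UTC

function formatPrice(locale, currency) {
  return new Intl.NumberFormat(locale, { style: 'currency', currency }).format(price);
}

function formatDate(locale) {
  // timeZone is pinned to UTC so the result does not depend on the host machine.
  return new Intl.DateTimeFormat(locale, { dateStyle: 'medium', timeZone: 'UTC' }).format(when);
}

console.log(formatPrice('en-US', 'USD')); // "$1,234.50"
console.log(formatDate('en-US'));         // "Jan 15, 2024"
console.log(formatPrice('de-DE', 'EUR')); // German grouping, decimal comma, symbol after the amount
console.log(formatDate('de-DE'));         // day-first numeric date
```

Separators, symbol placement, and date order all change with the locale, which is why these values belong inside the message as typed arguments instead of being pre-formatted in application code.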
Frequently Asked Questions
What is the difference between ICU MessageFormat and gettext?
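Gettext's ngettext() takes a singular and a plural source string, and each translation catalog declares how many plural forms its language needs through the Plural-Forms header, with translations indexed by number (msgstr[0], msgstr[1], and so on). ICU MessageFormat instead names the CLDR plural categories inside a single string and adds select (often used for gender), number formatting, and nesting. ICU is more expressive but has a steeper learning curve.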
Do I need ICU MessageFormat for simple apps?
If your application only needs simple string interpolation (e.g., "Hello, {name}"), basic key-value formats work fine. ICU MessageFormat becomes valuable when you need pluralization, gender handling, or complex conditional text across multiple locales.
How do translators work with ICU messages?
Many professional translators are familiar with ICU syntax, but not all, so context comments still help. Many translation management systems parse ICU messages and present each variant separately, which hides most of the syntax from translators while preserving the structure for developers.
Is ICU MessageFormat the same as MessageFormat 2.0?
No. MessageFormat 2.0 (MF2) is a newer specification developed by the Unicode Consortium as a successor to ICU MessageFormat. MF2 addresses limitations of the original format, including better error handling and extensibility. As of 2025, the MF2 specification has been published as final in CLDR 47, but implementations such as the one in ICU are still marked as technical preview, and MF2 is not yet widely adopted in production libraries.