ICU Message Format: Syntax, Patterns, and Implementation Guide

Eray Gündoğmuş

ICU MessageFormat is a standard syntax for handling complex translatable strings — including plurals, gender-based variations, and conditional text. Originally developed as part of the International Components for Unicode (ICU) project, the format is now used across programming languages and platforms as the go-to solution for messages that change based on runtime values.

If you've ever needed to display "1 item" vs. "2 items" or "She liked your photo" vs. "They liked your photo" in multiple languages, ICU MessageFormat is designed for exactly this.

Why ICU MessageFormat Exists

Simple key-value translation formats work well for static strings, but they break down when a message depends on variables:

// Naive approach — breaks in many languages
"You have " + count + " new messages"

This concatenation approach fails because:

  • Word order varies by language. Japanese typically places the verb at the end of the sentence, so the count cannot simply be dropped in where English puts it. Arabic may restructure the entire sentence.
  • Plural rules differ. English has 2 plural categories (one, other). Russian and Polish each have 4 (one, few, many, other). Arabic has 6.
  • Gender affects surrounding words. In French, "nouveau" (new) becomes "nouvelle" for feminine nouns, and "nouveaux" for masculine plural.

ICU MessageFormat solves these problems with a declarative syntax that lets translators handle all variations in a single string.

Basic Syntax

An ICU message is a string containing plain text mixed with argument placeholders enclosed in curly braces:

Hello, {name}!

When processed, {name} is replaced with the provided value. The power comes from format types that add conditional logic:

{count, plural, one {# message} other {# messages}}

This tells the formatter: "Look at the count argument. If the plural category is one, output # message. Otherwise, output # messages." The # symbol is replaced with the formatted number.
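
To make that selection step concrete, here is a minimal sketch of what a formatter does for a plural argument. It uses the built-in Intl.PluralRules and Intl.NumberFormat APIs rather than a real ICU parser, and it assumes the pattern has already been split into one string per category. The function name formatPlural and the shape of the cases object are hypothetical, chosen only for illustration.

```javascript
// Minimal sketch: resolve a plural argument whose cases are already parsed.
// `formatPlural` and the `cases` object are illustrative, not a library API.
function formatPlural(locale, count, cases) {
  // Ask the locale's plural rules which category this number falls into.
  const category = new Intl.PluralRules(locale).select(count);
  // Fall back to the required `other` case when the category has no entry.
  const template = cases[category] ?? cases.other;
  // Replace every `#` with the locale-formatted number.
  const formatted = new Intl.NumberFormat(locale).format(count);
  return template.replace(/#/g, () => formatted);
}

const cases = { one: '# message', other: '# messages' };
console.log(formatPlural('en', 1, cases));    // "1 message"
console.log(formatPlural('en', 1500, cases)); // "1,500 messages"
```

Note that # yields the formatted number ("1,500"), not the raw value, which is why the sketch goes through Intl.NumberFormat.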

Plural Patterns

Pluralization is ICU MessageFormat's most commonly used feature. The syntax is:

{variable, plural,
  =0 {No messages}
  one {# message}
  other {# messages}
}

CLDR Plural Categories

The Unicode CLDR (Common Locale Data Repository) defines six plural categories:

Category   Used by (examples)          Example counts (vary by language)
zero       Arabic, Latvian, Welsh      0
one        English, French, German     1
two        Arabic, Hebrew, Slovenian   2
few        Russian, Polish, Czech      2-4
many       Russian, Polish, Arabic     5-20
other      All languages (required)    everything else

Not every language uses every category. English only needs one and other. The other category is always required as a fallback.
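
One way to see which categories a locale actually uses is to ask the JavaScript runtime, which ships CLDR plural data behind Intl.PluralRules. The sketch below assumes a runtime with full ICU data (the default in current Node.js and browsers). The helper name pluralCategories is an illustrative wrapper, not a standard function.

```javascript
// List the cardinal plural categories a locale uses, via the built-in Intl API.
// `pluralCategories` is a hypothetical convenience wrapper.
function pluralCategories(locale) {
  return new Intl.PluralRules(locale).resolvedOptions().pluralCategories;
}

for (const locale of ['ja', 'en', 'ru', 'ar']) {
  console.log(locale, pluralCategories(locale).join(', '));
}

// The same count can land in a different category depending on the locale.
console.log(new Intl.PluralRules('en').select(5)); // "other"
console.log(new Intl.PluralRules('ru').select(5)); // "many"
console.log(new Intl.PluralRules('ar').select(5)); // "few"
```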

Exact Value Matching

Use =N to match exact numbers, which takes precedence over category matching:

{count, plural,
  =0 {Your inbox is empty}
  =1 {You have one new message}
  =42 {You have the answer to everything}
  other {You have # new messages}
}
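
The precedence rule can be sketched as a two-step lookup: try the exact =N key first, and only then fall back to the plural category. The helper pickPluralCase and the pre-parsed inbox object are hypothetical, and the sketch returns the chosen template without substituting #.

```javascript
// Sketch of case lookup order: exact `=N` keys win over plural categories.
// `pickPluralCase` is a hypothetical helper, not part of any library.
function pickPluralCase(locale, count, cases) {
  const exactKey = `=${count}`;
  if (Object.prototype.hasOwnProperty.call(cases, exactKey)) {
    return cases[exactKey];
  }
  const category = new Intl.PluralRules(locale).select(count);
  return cases[category] ?? cases.other;
}

const inbox = {
  '=0': 'Your inbox is empty',
  '=1': 'You have one new message',
  one: 'You have # new message',
  other: 'You have # new messages',
};

console.log(pickPluralCase('en', 0, inbox)); // "Your inbox is empty"
console.log(pickPluralCase('en', 1, inbox)); // "You have one new message" (=1 beats `one`)
console.log(pickPluralCase('en', 7, inbox)); // "You have # new messages"
```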

Select Patterns

The select type chooses output based on a string value, commonly used for gender:

{gender, select,
  female {She liked your post}
  male {He liked your post}
  other {They liked your post}
}

The other case is always required and serves as the fallback.
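
Conceptually, select is a plain string lookup with a mandatory fallback. A minimal sketch, assuming the cases have already been parsed into an object (selectCase is a hypothetical name):

```javascript
// Sketch of `select`: a string-keyed lookup with a required `other` fallback.
// `selectCase` is illustrative only; real libraries parse the pattern first.
function selectCase(value, cases) {
  const has = (key) => Object.prototype.hasOwnProperty.call(cases, key);
  if (!has('other')) {
    throw new Error('select requires an "other" case');
  }
  return has(value) ? cases[value] : cases.other;
}

const liked = {
  female: 'She liked your post',
  male: 'He liked your post',
  other: 'They liked your post',
};

console.log(selectCase('female', liked));  // "She liked your post"
console.log(selectCase('unknown', liked)); // "They liked your post"
```

Unlike plural, no locale data is involved: the keys are arbitrary strings agreed between the developer and the translator.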

Nested Patterns

ICU MessageFormat patterns can be nested for complex scenarios. For example, combining gender and plural:

{gender, select,
  female {{count, plural,
    one {She added # photo}
    other {She added # photos}
  }}
  male {{count, plural,
    one {He added # photo}
    other {He added # photos}
  }}
  other {{count, plural,
    one {They added # photo}
    other {They added # photos}
  }}
}

While powerful, deeply nested patterns are harder for translators to work with. Keep nesting to two levels when possible.

selectordinal Patterns

The selectordinal type handles ordinal numbers (1st, 2nd, 3rd, etc.):

{position, selectordinal,
  one {#st place}
  two {#nd place}
  few {#rd place}
  other {#th place}
}

This correctly produces "1st place", "2nd place", "3rd place", "4th place", and so on in English.
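
The category names here come from the locale's ordinal rules, not its cardinal ones, which is why 11, 12, and 13 end up in other even though they end in 1, 2, and 3. The sketch below reproduces the pattern above with the built-in Intl.PluralRules in ordinal mode. The suffixes table and ordinalPlace function are illustrative names.

```javascript
// Ordinal categories for English, using the built-in Intl API.
const ordinalRules = new Intl.PluralRules('en', { type: 'ordinal' });

// Mirrors the selectordinal cases above; `ordinalPlace` is a hypothetical helper.
const suffixes = { one: 'st', two: 'nd', few: 'rd', other: 'th' };

function ordinalPlace(n) {
  return `${n}${suffixes[ordinalRules.select(n)]} place`;
}

console.log(ordinalPlace(1));  // "1st place"
console.log(ordinalPlace(2));  // "2nd place"
console.log(ordinalPlace(3));  // "3rd place"
console.log(ordinalPlace(11)); // "11th place"
console.log(ordinalPlace(22)); // "22nd place"
```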

Implementation by Language

ICU MessageFormat is implemented in most programming ecosystems:

JavaScript / TypeScript

The intl-messageformat package (maintained by FormatJS) is a widely used implementation:

import { IntlMessageFormat } from 'intl-messageformat';

const message = new IntlMessageFormat(
  '{count, plural, one {# item} other {# items}}',
  'en'
);

message.format({ count: 1 });  // "1 item"
message.format({ count: 5 });  // "5 items"

React applications commonly use react-intl (also part of FormatJS), which wraps intl-messageformat with React components:

<FormattedMessage
  id="cart.itemCount"
  defaultMessage="{count, plural, one {# item} other {# items}}"
  values={{ count: 3 }}
/>

Java

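The JDK's java.text.MessageFormat does not understand ICU plural or select arguments; for number-dependent text it only offers the older choice format. ICU patterns require ICU4J (the com.ibm.icu:icu4j artifact), whose com.ibm.icu.text.MessageFormat class has a similar API:

import com.ibm.icu.text.MessageFormat;

String pattern = "{0, plural, one {# message} other {# messages}}";
// The static helper formats with the default locale.
String result = MessageFormat.format(pattern, 5);
// "5 messages"

ICU4J also supports select, selectordinal, and named arguments such as {count}.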

PHP

The intl extension provides MessageFormatter:

$formatter = new MessageFormatter('en', '{count, plural, one {# item} other {# items}}');
echo $formatter->format(['count' => 1]); // "1 item"

Best Practices

Keep Messages Translatable

  • Avoid concatenating ICU messages — give translators the full sentence
  • Provide context comments explaining when each variant appears
  • Test with languages that have complex plural rules (Arabic, Polish) to verify your patterns are complete
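
The completeness check in the last point can be partly automated. The following is a rough sketch of a lint-style helper that compares the categories a translation provides against the ones the target locale uses. The name missingPluralCategories is hypothetical, and the helper assumes the category names have already been extracted from the message.

```javascript
// Hypothetical lint-style check: which of a locale's plural categories
// does a translated message not provide explicitly?
function missingPluralCategories(locale, providedCategories) {
  const needed = new Intl.PluralRules(locale).resolvedOptions().pluralCategories;
  return needed.filter((category) => !providedCategories.includes(category));
}

// An English-shaped pattern is complete for English...
console.log(missingPluralCategories('en', ['one', 'other'])); // []
// ...but leaves "few" and "many" to the `other` fallback in Polish.
console.log(missingPluralCategories('pl', ['one', 'other']));
```

A non-empty result is not a syntax error, since other still catches those counts, but it usually means the translation will read incorrectly for them.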

Required Fallbacks

  • Always include the other category in plural and selectordinal
  • Always include the other case in select
  • These are required by the specification and serve as fallbacks for unexpected values

Formatting Numbers and Dates

ICU MessageFormat supports number and date formatting within messages:

{price, number, currency}
{date, date, medium}

The exact format depends on the locale, following CLDR conventions.
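
To show how much the output varies by locale, the sketch below formats the same price and date with the built-in Intl.NumberFormat and Intl.DateTimeFormat, which draw on the same CLDR data. It is an approximation of what the number and date argument types produce, not a MessageFormat call. The explicit currency codes and the UTC time zone are assumptions made so that the output is predictable.

```javascript
const price = 1234.5;
const date = new Date(Date.UTC(2024, 0, 15));

// Currency formatting: symbol position and separators depend on the locale.
const usd = new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' });
const eur = new Intl.NumberFormat('de-DE', { style: 'currency', currency: 'EUR' });
console.log(usd.format(price)); // "$1,234.50"
console.log(eur.format(price)); // "1.234,50 €" (non-breaking space before the symbol)

// Medium date style: field order and punctuation depend on the locale.
const enDate = new Intl.DateTimeFormat('en-US', { dateStyle: 'medium', timeZone: 'UTC' });
const deDate = new Intl.DateTimeFormat('de-DE', { dateStyle: 'medium', timeZone: 'UTC' });
console.log(enDate.format(date)); // "Jan 15, 2024"
console.log(deDate.format(date)); // "15.01.2024"
```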

Frequently Asked Questions

What is the difference between ICU MessageFormat and gettext?

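Gettext keeps plurals in separate entries: the source code calls ngettext() with a singular and a plural string, and each translation supplies as many msgstr[n] forms as the catalog's Plural-Forms header declares. That covers languages with more than two forms, but the rule is an arithmetic expression maintained per catalog rather than a set of named CLDR categories, and gettext has no built-in equivalent of select, inline number formatting, or nesting. ICU MessageFormat handles all of these in a single string. ICU is more expressive but has a steeper learning curve.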

Do I need ICU MessageFormat for simple apps?

If your application only needs simple string interpolation (e.g., "Hello, {name}"), basic key-value formats work fine. ICU MessageFormat becomes valuable when you need pluralization, gender handling, or complex conditional text across multiple locales.

How do translators work with ICU messages?

Translators who regularly work on software tend to know ICU syntax, but not all do, so context comments still help. Many translation management systems parse ICU messages and present each variant separately, which hides most of the syntax from translators while preserving the structure for developers.

Is ICU MessageFormat the same as MessageFormat 2.0?

No. MessageFormat 2.0 (MF2) is a new specification being developed by the Unicode Consortium as a successor to ICU MessageFormat. MF2 addresses limitations of the original format, including better error handling and extensibility. As of 2025, MF2 has reached technical preview status but is not yet widely adopted in production libraries.