Table of Contents
- Database Localization: Strategies for Storing Multilingual Data
- The Problem
- Strategy 1: Column Per Locale
- Querying
- Advantages
- Disadvantages
- Best for
- Strategy 2: Separate Translation Table
- Querying
- Advantages
- Disadvantages
- Best for
- Strategy 3: JSON Column
- Querying (PostgreSQL)
- Advantages
- Disadvantages
- Best for
- Comparison Summary
- Handling Locale Fallbacks
- Frequently Asked Questions
- Should translatable and non-translatable columns be in the same table?
- How do I handle full-text search across languages?
- What about NoSQL databases?
- How does this relate to UI string localization?
Database Localization: Strategies for Storing Multilingual Data
When building a multilingual application, one of the earliest architectural decisions is how to store translated content in the database. The approach you choose affects query performance, schema complexity, and how easily you can add new languages.
There is no single "correct" design — the best choice depends on the number of languages, volume of translatable content, query patterns, and team preferences. This guide covers the three most common strategies with practical examples.
The Problem
Consider a product catalog with names, descriptions, and categories. In a single-language application, the schema is straightforward:
```sql
CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  description TEXT,
  price DECIMAL(10,2),
  created_at TIMESTAMP DEFAULT NOW()
);
```
Once you need multiple languages, every text column becomes a localization challenge. The `price` and `created_at` columns are language-independent, but `name` and `description` need separate values for each locale.
Strategy 1: Column Per Locale
The simplest approach adds a column for each language:
```sql
CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name_en VARCHAR(255) NOT NULL,
  name_de VARCHAR(255),
  name_fr VARCHAR(255),
  name_ja VARCHAR(255),
  description_en TEXT,
  description_de TEXT,
  description_fr TEXT,
  description_ja TEXT,
  price DECIMAL(10,2),
  created_at TIMESTAMP DEFAULT NOW()
);
```
Querying
```sql
-- Fetch product with German translations, falling back to English
SELECT
  id,
  COALESCE(name_de, name_en) AS name,
  COALESCE(description_de, description_en) AS description,
  price
FROM products
WHERE id = 42;
```
Advantages
- Simple queries — No joins required. All data is in a single row.
- Easy to understand — The schema is self-documenting.
- Good read performance — Single table scan with no joins.
Disadvantages
- Schema changes for new languages — Adding a language requires an `ALTER TABLE` for every translatable column. For a table with 5 translatable fields and 10 languages, that's 50 columns.
- Sparse data — Most rows will have NULL values for languages without translations.
- Hard to scale — Query complexity grows linearly with the number of languages.
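The migration cost is concrete: adding one more language means one `ALTER TABLE` per translatable column. A sketch, assuming the `products` schema from this strategy and Spanish (`es`) as the new language:

```sql
-- Adding Spanish means one new column per translatable field
ALTER TABLE products ADD COLUMN name_es VARCHAR(255);
ALTER TABLE products ADD COLUMN description_es TEXT;
-- ...and so on for every other translatable column,
-- plus updating every query that selects translated text
```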
Best for
Applications with a small, fixed set of languages (2-4) that rarely changes. Simple content models with few translatable fields.
Strategy 2: Separate Translation Table
This approach normalizes translations into a dedicated table:
```sql
CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  price DECIMAL(10,2),
  created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE product_translations (
  id SERIAL PRIMARY KEY,
  product_id INTEGER NOT NULL REFERENCES products(id) ON DELETE CASCADE,
  locale VARCHAR(10) NOT NULL,
  name VARCHAR(255) NOT NULL,
  description TEXT,
  UNIQUE (product_id, locale)
);

CREATE INDEX idx_product_translations_lookup
  ON product_translations (product_id, locale);
```
Querying
```sql
-- Fetch product with French translations, falling back to English
SELECT
  p.id,
  COALESCE(t.name, fallback.name) AS name,
  COALESCE(t.description, fallback.description) AS description,
  p.price
FROM products p
LEFT JOIN product_translations t
  ON t.product_id = p.id AND t.locale = 'fr'
LEFT JOIN product_translations fallback
  ON fallback.product_id = p.id AND fallback.locale = 'en'
WHERE p.id = 42;
```
Advantages
- No schema changes for new languages — Adding a language is just inserting new rows.
- Clean separation — Language-independent data stays in the main table.
- Flexible — Easy to query which languages are available for any entity.
- Standard pattern — Well-understood by ORMs and frameworks. Rails, Django, and Laravel all have libraries supporting this pattern.
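As a sketch of that flexibility, listing which locales exist for an entity is a single query against the translation table (assuming the `product_translations` schema above):

```sql
-- Which locales have a translation for product 42?
SELECT locale
FROM product_translations
WHERE product_id = 42
ORDER BY locale;
```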
Disadvantages
- Join required — Every query for translated data needs at least one JOIN.
- Fallback complexity — Language fallback chains (e.g., `pt-BR` → `pt` → `en`) require multiple JOINs or application logic.
- More tables — Each translatable entity needs its own translation table.
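A longer fallback chain can still be expressed in SQL by stacking one LEFT JOIN per step in the chain. A sketch for Brazilian Portuguese falling back to generic Portuguese, then English, using the schema above:

```sql
-- First non-NULL name along the chain pt-BR → pt → en
SELECT
  p.id,
  COALESCE(t1.name, t2.name, t3.name) AS name,
  p.price
FROM products p
LEFT JOIN product_translations t1
  ON t1.product_id = p.id AND t1.locale = 'pt-BR'
LEFT JOIN product_translations t2
  ON t2.product_id = p.id AND t2.locale = 'pt'
LEFT JOIN product_translations t3
  ON t3.product_id = p.id AND t3.locale = 'en'
WHERE p.id = 42;
```

Each extra fallback step adds another JOIN, which is why longer chains are often resolved in application code instead.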
Best for
Applications with many languages, content models that grow over time, and teams that value clean database normalization.
Strategy 3: JSON Column
Modern databases (PostgreSQL 9.4+, MySQL 5.7+, SQLite 3.38+) support JSON columns:
```sql
CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name JSONB NOT NULL DEFAULT '{}',
  description JSONB DEFAULT '{}',
  price DECIMAL(10,2),
  created_at TIMESTAMP DEFAULT NOW()
);

-- Insert with translations
INSERT INTO products (name, description, price)
VALUES (
  '{"en": "Wireless Headphones", "de": "Kabellose Kopfhörer", "fr": "Écouteurs sans fil"}',
  '{"en": "Premium wireless headphones with noise cancellation.", "de": "Premium kabellose Kopfhörer mit Geräuschunterdrückung."}',
  79.99
);
```
Querying (PostgreSQL)
```sql
-- Fetch product with French translations, falling back to English
SELECT
  id,
  COALESCE(name->>'fr', name->>'en') AS name,
  COALESCE(description->>'fr', description->>'en') AS description,
  price
FROM products
WHERE id = 42;
```
Advantages
- No schema changes — New languages are just new JSON keys.
- No joins — All translations are in the same row.
- Flexible — Different rows can have different sets of languages.
- GIN indexing — PostgreSQL JSONB columns support GIN indexes for efficient querying.
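For example, a single GIN index over the whole JSONB column can answer key-existence queries such as "which rows have a German translation" (a PostgreSQL sketch; the index name is illustrative, the `?` operator is the standard `jsonb` key-existence test):

```sql
-- One GIN index covers all language keys in the name column
CREATE INDEX idx_products_name_gin ON products USING GIN (name);

-- Find products that have a German name
SELECT id FROM products WHERE name ? 'de';
```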
Disadvantages
- Weaker type safety — The database cannot enforce that every JSON object has the required keys or value types.
- No referential integrity — Unlike the translation table approach, there's no foreign key to validate locale codes.
- ORM support varies — Some ORMs don't handle JSON columns natively, requiring custom serialization.
- Full-text search complexity — Searching across all languages in a JSON column requires specialized indexing.
Best for
Applications using PostgreSQL or MySQL with moderate translation volumes, teams that prefer minimal schema, and projects where translation completeness varies significantly across languages.
Comparison Summary
| Criterion | Column Per Locale | Translation Table | JSON Column |
|---|---|---|---|
| New language cost | Schema migration | Insert rows | No change |
| Query complexity | Simple | JOIN required | Medium |
| Read performance | Excellent | Good (with index) | Good |
| Type safety | Full | Full | Partial |
| ORM support | Built-in | Widely supported | Varies |
| Best for | Few fixed languages | Many languages | Flexible models |
Handling Locale Fallbacks
Regardless of which strategy you choose, your application needs a fallback chain for missing translations. A common pattern:
- Exact match (`fr-CA` for Canadian French)
- Language match (`fr` for generic French)
- Default language (`en` for English)
This fallback logic is typically implemented in the application layer rather than in SQL, as it may involve user preferences and locale negotiation.
Frequently Asked Questions
Should translatable and non-translatable columns be in the same table?
It depends on your query patterns. If you frequently query translated content, keeping it close to the entity reduces joins. If you frequently query only non-translatable data (pricing, inventory), separating translations avoids loading unnecessary data.
How do I handle full-text search across languages?
PostgreSQL supports language-specific tsvector configurations. Create separate text search indexes for each language, or use the simple configuration for language-agnostic searching. With the translation table approach, you can create a tsvector column per row since each row already represents one language.
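A sketch of the per-language approach on the translation table (PostgreSQL; the index name is illustrative). A partial expression index restricts each full-text index to rows in one locale, so each language uses its own stemming configuration:

```sql
-- Partial expression index: German full-text search over German rows only
CREATE INDEX idx_product_translations_fts_de
  ON product_translations
  USING GIN (to_tsvector('german',
       coalesce(name, '') || ' ' || coalesce(description, '')))
  WHERE locale = 'de';

-- Search German translations for "Kopfhörer"
SELECT product_id
FROM product_translations
WHERE locale = 'de'
  AND to_tsvector('german', coalesce(name, '') || ' ' || coalesce(description, ''))
      @@ plainto_tsquery('german', 'Kopfhörer');
```

The query repeats the indexed expression and the `WHERE locale = 'de'` predicate so the planner can match the partial index.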
What about NoSQL databases?
Document databases like MongoDB naturally support the JSON approach — each document embeds translations as nested objects. This works well for content-heavy applications but lacks the referential integrity guarantees of relational databases.
How does this relate to UI string localization?
Database localization handles user-generated or admin-managed content (product names, page titles, CMS content). UI string localization — button labels, error messages, navigation text — is typically handled by i18n libraries (react-intl, vue-i18n, etc.) using JSON or other file-based formats. Most applications need both.