
Pronunciation dictionary

OpenAI’s Realtime voices guess pronunciation from spelling. That works well for common English words, but brand names, SKUs, acronyms, and foreign words often come out wrong, sometimes badly. The pronunciation dictionary lets you override specific words with phonetic spellings.

How it works

Each entry is a pair:

interface PronunciationEntry {
  word: string;   // the word to override
  say_as: string; // the phonetic spelling
}

At session start, Spelo embeds your dictionary into the system prompt:

When you say the word “Hodos360”, pronounce it as “HOE-dose three-sixty”. When you say “Pacific Realty”, pronounce it as “Pacific Reel-tee” (two syllables, not three).

The model usually follows these instructions. For the exceptions, see “When pronunciation doesn’t stick” below.
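As a rough illustration of the embedding step, the sketch below turns a list of entries into prompt sentences. The function name `buildPronunciationPrompt` and the exact sentence template are assumptions modeled on the example above; Spelo's actual template may differ.

```typescript
interface PronunciationEntry {
  word: string;   // the word to override
  say_as: string; // the phonetic spelling
}

// Hypothetical helper, not part of Spelo's API: renders one instruction
// sentence per entry and joins them into a single prompt fragment.
function buildPronunciationPrompt(entries: PronunciationEntry[]): string {
  return entries
    .map((e) => `When you say "${e.word}", pronounce it as "${e.say_as}".`)
    .join(" ");
}
```

An empty dictionary yields an empty string, so nothing is added to the prompt when there are no entries.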

Writing phonetic spellings

You don’t need IPA. Write the word the way you’d spell it out phonetically in an email to someone who has never heard the name:

Word        | Say as               | Why
Hodos360    | HOE-dose three-sixty | The AI otherwise says “Ha-doss three hundred sixty”
Atlas Legal | AT-las LEE-gul       | Distinguishes from Atlas-as-in-map
Pho         | fuh                  | Vietnamese noodle soup, not “foe”
macOS       | mac-oh-ess           | Not “mac-oss”
Réalité     | ray-ah-lee-tay       | French brand name
EMR         | ee-em-are            | Acronym; prevent the AI from pronouncing as a word

Guidelines:

  • Use hyphens between syllables. They become micro-pauses in speech.
  • Capitalize stressed syllables. HOE-dose not hoe-DOSE.
  • Don’t use IPA symbols. The AI gets confused. Plain English approximations work better.
  • Test out loud. Read your say_as aloud exactly as written; if it sounds right to you, the AI will usually say it the same way.
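The guidelines above can be partly checked before an entry is saved. The sketch below is a hypothetical linter (`lintSayAs` is not a Spelo function) that flags two of the problems mentioned on this page: non-ASCII characters, which usually mean IPA symbols or accents, and all-caps runs with no hyphens such as HODOSE. The four-letter threshold is an arbitrary assumption.

```typescript
// Hypothetical pre-save check for a say_as value. Returns a list of
// human-readable warnings; an empty list means no issue was detected.
function lintSayAs(sayAs: string): string[] {
  const warnings: string[] = [];
  const trimmed = sayAs.trim();
  if (trimmed === "") {
    warnings.push("say_as is empty");
    return warnings;
  }
  // Anything outside ASCII is likely an IPA symbol or an accented letter.
  if (/[^\x00-\x7F]/.test(trimmed)) {
    warnings.push("contains non-ASCII characters (possible IPA); use plain English approximations");
  }
  // An unhyphenated all-caps token gives the model no syllable boundaries.
  for (const token of trimmed.split(/\s+/)) {
    if (/^[A-Z]{4,}$/.test(token)) {
      warnings.push(`"${token}" is all caps with no hyphens; split it into syllables`);
    }
  }
  return warnings;
}
```

A check like this only catches mechanical mistakes. It cannot tell whether a spelling actually sounds right, so listening to the preview is still necessary.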

Editing in the dashboard

  1. Dashboard → Voice → Pronunciations
  2. Click Add entry
  3. Fill in word (the thing in your data) and say_as (how to pronounce it)
  4. Save
  5. Test by clicking Preview voice and typing a sentence that contains the word

The preview uses the same system prompt as live sessions, so what you hear is what visitors will hear.

Bulk import

For dozens of entries (e.g. a real-estate firm with 200 street names), the dashboard has a Bulk import button that accepts CSV:

word,say_as
Hodos360,HOE-dose three-sixty
La Cienega,lah see-EN-uh-guh
Sepulveda,suh-PUHL-vuh-duh
Los Feliz,los FEE-liss
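If you generate the CSV from your own data, it can help to validate it locally first. The sketch below parses the two-column format shown above. It assumes a `word,say_as` header row and no quoted fields, and it splits each row on the first comma only. How the dashboard importer handles quoting or commas inside a word is not documented here, so treat this as an assumption.

```typescript
interface PronunciationEntry {
  word: string;
  say_as: string;
}

// Hypothetical local parser for the bulk-import format (not Spelo's importer).
function parsePronunciationCsv(csv: string): PronunciationEntry[] {
  const lines = csv
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "");
  if (lines.length === 0 || lines[0] !== "word,say_as") {
    throw new Error('expected header row "word,say_as"');
  }
  return lines.slice(1).map((line, i) => {
    const comma = line.indexOf(",");
    // Reject rows with no comma, an empty word, or an empty say_as.
    if (comma <= 0 || comma === line.length - 1) {
      throw new Error(`row ${i + 2}: expected "word,say_as"`);
    }
    return {
      word: line.slice(0, comma).trim(),
      say_as: line.slice(comma + 1).trim(),
    };
  });
}
```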

Programmatic updates

Update via the Sites API:

curl -X PATCH "https://api.spelo.ai/v1/sites/<site_id>" \
  -H "Authorization: Bearer $SPELO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "pronunciations": [
      { "word": "Hodos360", "say_as": "HOE-dose three-sixty" },
      { "word": "La Cienega", "say_as": "lah see-EN-uh-guh" }
    ]
  }'
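The same request can be sent from TypeScript. The sketch below only builds the URL and request options, mirroring the curl call above; `buildPronunciationPatch` is a hypothetical helper, not part of a Spelo SDK. Whether the PATCH replaces the whole list or merges into it is not stated on this page, so check the result before relying on either behavior.

```typescript
interface PronunciationEntry {
  word: string;
  say_as: string;
}

interface PatchRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: assembles the PATCH request shown in the curl example.
function buildPronunciationPatch(
  siteId: string,
  apiKey: string,
  pronunciations: PronunciationEntry[],
): PatchRequest {
  return {
    url: `https://api.spelo.ai/v1/sites/${encodeURIComponent(siteId)}`,
    init: {
      method: "PATCH",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ pronunciations }),
    },
  };
}

// Usage, in a runtime with a global fetch (for example Node 18+):
//   const { url, init } = buildPronunciationPatch(siteId, apiKey, entries);
//   const res = await fetch(url, init);
//   if (!res.ok) throw new Error(`PATCH failed: ${res.status}`);
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without touching the API.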

Scope and limits

  • Entries are per-site. If you operate multiple sites, pronunciations don’t leak across.
  • Maximum 500 entries per site (Pro plan and above). Starter plan caps at 50.
  • The dictionary is injected into the system prompt, which consumes tokens. At 500 entries you’re spending ~3K tokens per session on pronunciations alone. We automatically deduplicate entries and, if you hit the cap, sort them by how often each word appears in your content so the most-used entries take priority.
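To make the deduplicate-and-prioritize behavior concrete, here is a minimal sketch of one way it could work. It is an illustration under assumptions, not Spelo's implementation: words are matched case-insensitively as plain substrings, the last duplicate wins, and ties keep their original order.

```typescript
interface PronunciationEntry {
  word: string;
  say_as: string;
}

// Counts non-overlapping, case-insensitive occurrences of `word` in `content`.
function countOccurrences(content: string, word: string): number {
  if (word === "") return 0;
  const haystack = content.toLowerCase();
  const needle = word.toLowerCase();
  let count = 0;
  let from = haystack.indexOf(needle);
  while (from !== -1) {
    count++;
    from = haystack.indexOf(needle, from + needle.length);
  }
  return count;
}

// Hypothetical prioritizer: dedupe by word, rank by frequency, keep `cap` entries.
function prioritizeEntries(
  entries: PronunciationEntry[],
  content: string,
  cap: number,
): PronunciationEntry[] {
  const byWord = new Map<string, PronunciationEntry>();
  for (const entry of entries) {
    byWord.set(entry.word.toLowerCase(), entry); // last duplicate wins
  }
  return Array.from(byWord.values())
    .map((entry) => ({ entry, freq: countOccurrences(content, entry.word) }))
    .sort((a, b) => b.freq - a.freq) // most frequent first
    .slice(0, cap)
    .map((ranked) => ranked.entry);
}
```

With a cap of 50 (the Starter limit above), a call such as `prioritizeEntries(entries, siteText, 50)` would keep the 50 entries whose words appear most often in `siteText`.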

When pronunciation doesn’t stick

Occasionally the AI will still mispronounce a word despite the dictionary. The cause is usually one of the following:

  • The word is embedded in a URL or code. The AI spells out URLs character-by-character; pronunciations don’t apply.
  • The phonetic spelling itself is ambiguous. Try a different spelling. HOE-dose works; HODOSE becomes “ho-dose-ee”.
  • Model limitation. Rarely, the model just gets stuck. Report it at github.com/spelo/spelo/issues. We maintain a list of known problematic names and lobby OpenAI to improve them.

See also

  • Voices — different voices have different accents; some say words better than others
  • Languages — for non-English sites
  • Templates — industry templates ship with common pronunciations pre-filled