Features

Voice AI features for any website

Real-time voice, live page reading, on-page actions, cross-page continuity. Everything Spelo does, what it runs on, and how each piece works.

Real-time voice

Full-duplex voice that feels human

Spelo runs on real-time voice infrastructure designed for the latency of a phone call. Visitors talk and the assistant responds in well under a second, fast enough to interrupt naturally and clarify without the robotic pauses that make most voice products feel awkward.

Turns end the moment a visitor stops speaking. No "press to talk" buttons. No transcribe-then-reply round-trip. Just conversation.

Sub-second response latency, end-of-speech to first audio
Interruptible: stop the assistant mid-sentence and it listens
Smart turn-taking: the assistant knows when you've stopped
Tuned audio path: no awkward streaming delay

Live page reading

Reads your pages on every request

When a visitor asks something, Spelo scrapes the rendered HTML of the page they're on and sends a structured summary to the model: headings, sections, products, navigation links, CTAs, image alt text. Every answer is grounded in what is on the page right now.

No reindexing jobs. No vector store sync. Update a product price at 9:00 a.m. and the assistant knows by 9:00:01.

Scrapes the rendered DOM at conversation time, not at build time
Extracts headings, sections, products, navigation, images, CTAs
Handles every page builder: Elementor, Divi, Webflow CMS, Shopify Liquid, plain HTML
Zero data sync. The agent always sees the latest version.

Page navigation

Navigates and scrolls while it talks

When the conversation calls for a different page or section, the assistant moves there. "Show me your pricing": page changes. "Scroll to the FAQ": page scrolls. The visitor watches their intent happen on screen while the assistant explains what they're seeing.

On modern frameworks (Next.js, Vue, Astro with view transitions) the voice connection survives navigation seamlessly. The conversation continues mid-sentence as the page changes.

navigate(): moves to any page on the site
scroll_to(): scrolls to a named section, heading, or anchor
scroll_by(): handles vague directional commands ("scroll down a little")
Cross-page voice continuity via the SDK install (React, Next.js, Astro)
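
For readers who want a mental model of the tools listed above, here is a minimal sketch of how such tool calls might be routed to the page. Only the tool names (navigate, scroll_to, scroll_by) come from the feature list. The `ToolCall` shape, the argument names, the `PageDriver` interface, and the pixel amounts are all hypothetical and are not Spelo's actual API.

```typescript
// Hypothetical tool-call shapes; only the three tool names come from the feature list.
type ToolCall =
  | { name: "navigate"; args: { path: string } }
  | { name: "scroll_to"; args: { target: string } }
  | { name: "scroll_by"; args: { direction: "up" | "down"; amount: "a little" | "a lot" } };

// Thin abstraction over the browser so the dispatcher can run without a DOM.
interface PageDriver {
  goTo(path: string): void;
  scrollToTarget(target: string): boolean; // false when no section matches
  scrollByPixels(deltaY: number): void;
}

// Assumed mapping from vague spoken amounts to pixel distances.
const SCROLL_PIXELS = { "a little": 200, "a lot": 800 } as const;

// Routes one tool call to the page and returns a short result string
// that could be handed back to the voice model.
function dispatchTool(call: ToolCall, page: PageDriver): string {
  switch (call.name) {
    case "navigate":
      page.goTo(call.args.path);
      return `navigated to ${call.args.path}`;
    case "scroll_to":
      return page.scrollToTarget(call.args.target)
        ? `scrolled to ${call.args.target}`
        : `no section matching ${call.args.target}`;
    case "scroll_by": {
      const sign = call.args.direction === "down" ? 1 : -1;
      page.scrollByPixels(sign * SCROLL_PIXELS[call.args.amount]);
      return `scrolled ${call.args.direction}`;
    }
  }
}
```

Returning a result string for every call, including the "nothing matched" case, is one way to let the assistant tell the visitor what actually happened on screen.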

Form filling

Fills any form by voice

Spelo identifies form fields by label, placeholder, name, or surrounding context, then fills them in response to what the visitor says. Date and time inputs are auto-coerced: say "April 28th at 3 PM" and the agent emits the correct ISO date and 24-hour time the input accepts.

After fields are filled, the assistant clicks the submit button automatically. No need to dictate field names or pick from menus.

Matches fields by label, placeholder, name, ARIA, or adjacent <label>
Auto-coerces natural language into the formats that <input type="date"> and <input type="time"> accept
Handles dropdowns: speak the option, the agent selects it
Submits the form when filling is complete
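
To illustrate the date and time coercion described above, here is a minimal sketch that turns a phrase like "April 28th at 3 PM" into the YYYY-MM-DD and 24-hour HH:MM strings that date and time inputs accept. The function name `coerceSpokenDateTime`, the narrow phrase pattern, and the explicit `defaultYear` parameter are assumptions for illustration; the real agent presumably handles far more phrasings.

```typescript
const MONTHS = [
  "january", "february", "march", "april", "may", "june",
  "july", "august", "september", "october", "november", "december",
];

interface CoercedDateTime {
  date: string; // YYYY-MM-DD, the value format of <input type="date">
  time: string; // HH:MM in 24-hour form, the value format of <input type="time">
}

// Parses phrases shaped like "April 28th at 3 PM" or "April 28 at 3:30 pm".
// The spoken phrase carries no year, so the caller supplies one (an assumption of this sketch).
function coerceSpokenDateTime(phrase: string, defaultYear: number): CoercedDateTime | null {
  const match = phrase
    .trim()
    .toLowerCase()
    .match(/^([a-z]+)\s+(\d{1,2})(?:st|nd|rd|th)?\s+at\s+(\d{1,2})(?::(\d{2}))?\s*(am|pm)$/);
  if (!match) return null;

  const monthIndex = MONTHS.indexOf(match[1] ?? "");
  const day = Number(match[2] ?? "0");
  const hour12 = Number(match[3] ?? "0");
  const minute = Number(match[4] ?? "0"); // minutes are optional in the phrase
  const meridiem = match[5];
  if (monthIndex < 0 || hour12 < 1 || hour12 > 12 || minute > 59) return null;

  // Reject impossible dates such as "February 30th" by round-tripping through Date.
  const probe = new Date(Date.UTC(defaultYear, monthIndex, day));
  if (probe.getUTCMonth() !== monthIndex || probe.getUTCDate() !== day) return null;

  // 12 AM maps to 00 and 12 PM stays 12.
  const hour24 = (hour12 % 12) + (meridiem === "pm" ? 12 : 0);
  const pad = (n: number): string => String(n).padStart(2, "0");
  return {
    date: `${defaultYear}-${pad(monthIndex + 1)}-${pad(day)}`,
    time: `${pad(hour24)}:${pad(minute)}`,
  };
}
```

For example, `coerceSpokenDateTime("April 28th at 3 PM", 2025)` yields `{ date: "2025-04-28", time: "15:00" }`, and an unparseable or impossible phrase yields `null` so the assistant can ask the visitor to clarify.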

On-page actions

Clicks any button by visible text

Tabs, filters, dropdowns, accordions, "Add to cart," "Confirm Booking": anything with a label, the assistant can click. No selectors to configure, no DOM IDs to maintain. The visible text is the contract.

Combine with form filling for end-to-end task completion: the visitor says "book me Tuesday at 4," the assistant fills in the date and time and clicks Book, the page shows the confirmation, and the assistant tells the visitor they're booked.

Identifies clickable elements by visible text: buttons, links, role="button"
Works with every UI library: there are no selectors to configure, so the component framework doesn't matter
Combines with form filling to complete bookings, signups, orders end-to-end
Ignores hidden or off-screen elements automatically
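
The "visible text is the contract" idea can be sketched as a small matching function. This is an illustration under stated assumptions, not Spelo's implementation: the `Clickable` snapshot type and the rule of preferring an exact match and accepting a partial match only when it is unambiguous are both hypothetical.

```typescript
// Hypothetical snapshot of a clickable element, collected from buttons,
// links, and role="button" elements on the page.
interface Clickable {
  id: string;       // internal handle, never shown to the visitor
  text: string;     // visible label
  visible: boolean; // false for hidden or off-screen elements
}

// Collapses whitespace and case so "Add  to Cart" and "add to cart" compare equal.
function normalizeLabel(text: string): string {
  return text.trim().replace(/\s+/g, " ").toLowerCase();
}

// Picks the element the visitor most plausibly meant, or null when the
// request is empty, unmatched, or ambiguous.
function findClickable(candidates: Clickable[], spoken: string): Clickable | null {
  const wanted = normalizeLabel(spoken);
  if (wanted === "") return null;

  // Hidden and off-screen elements are never eligible.
  const visible = candidates.filter((c) => c.visible);

  const exact = visible.find((c) => normalizeLabel(c.text) === wanted);
  if (exact) return exact;

  // Fall back to a substring match, but only when exactly one element qualifies.
  const [only, ...rest] = visible.filter((c) => normalizeLabel(c.text).includes(wanted));
  return only !== undefined && rest.length === 0 ? only : null;
}
```

Returning `null` on an ambiguous match, rather than guessing, leaves room for the assistant to ask which button the visitor meant.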

Voice continuity

Voice survives navigation between pages

On modern framework sites (Next.js, Vue, Astro with view transitions) the Spelo SDK keeps the voice connection alive across page changes. The assistant keeps talking through the navigation. No reconnect, no audio drop, no awkward pause.

On classic full-reload sites, the script tag auto-resumes the call on the new page. Same assistant, same conversation, picks up where it left off. The visitor never sees a "reconnecting" state.

SDK install: zero-drop voice across client-side navigation
Script tag install: auto-resume from one page to the next
The assistant remembers where the conversation was
The visitor experiences one continuous conversation across the whole site

Edge-glow UI

A pill that looks like part of your brand

The Spelo pill sits in the bottom-right corner. It expands when the visitor clicks, collapses when they're done, and pulses softly when the assistant is speaking. A blue glow around the page edge tells the visitor the assistant is listening or talking, visible in the corner of their eye, never in the way.

Mounted in a Shadow DOM, the widget can't collide with your CSS. Color and position are configurable from the dashboard.

Shadow DOM isolation: your stylesheet and ours never fight
Color theming, position, and orb size set from the dashboard
Edge-glow is OFF by default to stay quiet on visual-heavy sites
Reduced-motion friendly: every pulse stops if the visitor prefers reduced motion

Install anywhere

One product, every install path

Same engine, two install patterns. Plain HTML, classic WordPress, Shopify, Squarespace, Wix: paste a single async script tag. Modern frameworks (Next.js, Astro, Remix, Vite + React) install the @spelo/system SDK and drop <SpeloProvider /> (or <Spelo /> for Astro) into your root layout. The SDK ships real React and Astro components with full TypeScript types; the heavy runtime lazy-loads from our CDN on first orb interaction.

On install, Spelo detects the host framework and console-warns if you picked a pattern that won't survive your site's routing. No surprises, no support tickets six months later.

Script tag: <script src="https://spelo.ai/spelo.js" data-site-id="..." async></script>
React / Next.js: import { SpeloProvider } from "@spelo/system/react"
Astro: import Spelo from "@spelo/system/astro/Spelo.astro"
Runtime detection warns when the install pattern does not match the framework

Pick your engine

Two voice engines. One widget. Switch anytime.

Spelo runs on either OpenAI Realtime (gpt-realtime) or Google Gemini Live (gemini-2.5 native audio). Pick the one that fits your site from the dashboard. New signups default to Gemini for cost and language coverage. OpenAI stays available as the premium option, and existing customers can keep what they have.

Both engines fire the exact same tools (scroll, navigate, click, fill forms, database lookup) and serve the same pill widget to your visitors. The install snippet does not change. Switching engines is a toggle, not a reinstall.

OpenAI Realtime: 8 voices, English-first, premium quality, ~$0.30/min
Google Gemini Live: 30 voices, 97 languages, ~$0.023/min (~13× cheaper)
Same install snippet, same tools, same UX regardless of engine
Per-site toggle in the dashboard, switch anytime, no code change

Multilingual

97 languages on the Gemini engine

Sites running on Gemini Live can hold full-duplex voice conversations in 97 languages (Spanish, French, German, Hindi, Mandarin, Arabic, Portuguese, Japanese, Korean, and dozens more) out of the box. The agent matches the visitor's language automatically; you do not need to maintain separate widgets per locale.

OpenAI Realtime is English-first, with capable but uneven coverage of other languages. If your audience is global or non-English, Gemini is the right pick. If your audience is English-only and you want the highest voice quality, OpenAI is the right pick.

Gemini engine: 97 languages, auto-detected per visitor
OpenAI engine: English-first, with limited coverage of other languages
No separate widget per locale: one install, every language
Visitor speaks Spanish, the agent answers in Spanish

Cost efficiency

Voice that scales without burning the budget

Voice models are priced by the minute and the difference between vendors is large. Gemini Live runs at roughly $0.023 per minute of audio. OpenAI Realtime runs at roughly $0.30 per minute, about 13× more. For a site running thousands of minutes a month, that gap shows up as real money.

You do not have to choose between cost and capability. Both engines fire the same tools and present the same widget. Sites that need premium English voice run on OpenAI; sites that need volume or multilingual coverage run on Gemini.

~$0.023/min on Gemini vs ~$0.30/min on OpenAI
Same tools, same UX: only the underlying model changes
Switch engines anytime without reinstall or downtime
New signups default to Gemini for the cheaper baseline
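
To make the per-minute gap concrete, here is a small worked calculation using the approximate rates quoted above. The function names and the 5,000-minute example volume are illustrative assumptions. Actual bills depend on the vendor's current pricing and on how audio minutes are metered.

```typescript
// Approximate per-minute audio rates quoted on this page, in US dollars.
const RATE_PER_MINUTE_USD = { gemini: 0.023, openai: 0.3 } as const;
type Engine = keyof typeof RATE_PER_MINUTE_USD;

// Estimated monthly spend for a given number of voice minutes, rounded to cents.
function monthlyCostUsd(engine: Engine, minutes: number): number {
  return Math.round(RATE_PER_MINUTE_USD[engine] * minutes * 100) / 100;
}

// How many times more expensive the OpenAI rate is than the Gemini rate.
function openAiToGeminiRatio(): number {
  return RATE_PER_MINUTE_USD.openai / RATE_PER_MINUTE_USD.gemini;
}

// Example: a site handling 5,000 voice minutes a month.
const geminiMonthly = monthlyCostUsd("gemini", 5000); // 115
const openAiMonthly = monthlyCostUsd("openai", 5000); // 1500
```

At the quoted rates the ratio works out to roughly 13, so the same 5,000 minutes cost about $115 on Gemini and about $1,500 on OpenAI.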

Brand personality

Sound like your brand, not like ChatGPT

Set a voice, a name, a greeting, and a tone (formal, casual, warm, or witty). Spelo applies it across every conversation. Add pronunciation overrides for product names, founder names, anything the model might mispronounce.

Knowledge documents (PDFs, FAQs, policies) get embedded into the system prompt so the assistant draws from your real source-of-truth, not generic training data. Voice options scale with the engine you pick: 8 voices on OpenAI, 30 on Gemini.

OpenAI voices: 8 (alloy, echo, shimmer, and more), English-first
Gemini voices: 30, with multilingual coverage built in
Custom greeting and personality tone editable inline
Pronunciation table and knowledge documents apply across both engines

Built for production

Secure, scoped, and audit-friendly

Every conversation is signed and scoped to your site. Your site ID cannot be reused on another domain. Spelo enforces allowed-origins per site so impostors can't hijack your branded assistant. Rate limits are plan-aware. Conversations are recorded for review on plans that opt in.

No data leaves the visitor's browser without authentication. The same protections apply to every install path so you don't pay a security tax for picking a different framework.

Per-session signed authentication on every request
Allowed-origins enforcement: your site ID cannot be reused on another domain
Plan-aware rate limits prevent abuse
Optional conversation recording for plans that opt in
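
Allowed-origins enforcement, as described above, comes down to comparing the requesting page's origin against a per-site list. The following is a minimal sketch of that check, not Spelo's server code: the `SiteConfig` shape and the function name are hypothetical, and the sketch assumes the list stores normalized origins such as https://shop.example.com.

```typescript
// Hypothetical per-site configuration; the field names are assumptions.
interface SiteConfig {
  siteId: string;
  allowedOrigins: string[]; // normalized origins, e.g. "https://shop.example.com"
}

// Returns true only when the request's Origin header exactly matches
// one of the site's allowed origins.
function isOriginAllowed(site: SiteConfig, originHeader: string | undefined): boolean {
  if (!originHeader) return false;

  let origin: string;
  try {
    // The URL parser normalizes case and drops default ports.
    origin = new URL(originHeader).origin;
  } catch {
    return false; // malformed or opaque origins such as "null" are rejected
  }

  // Exact comparison on purpose: prefix or substring checks would let
  // "https://shop.example.com.evil.test" pass.
  return site.allowedOrigins.some((allowed) => allowed === origin);
}
```

A check like this is only one layer. It complements, and does not replace, the per-session signed authentication listed above.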

FAQ

Feature questions

Still curious? Email hello@spelo.ai. Usually a same-day reply.

Can I use Spelo on any website?

Yes. Spelo works on every site that renders HTML to a browser, including plain HTML, WordPress, Shopify, Squarespace, Wix, Webflow, Next.js, Astro, Vue, Svelte, Remix, and Gatsby. Pick the script tag for classic stacks or the SDK component for modern frameworks. Both bottom out in the same engine.

How is this different from a chatbot?

A chatbot waits for the visitor to type. Spelo is full-duplex voice: the visitor talks, the assistant responds in real voice, and the page updates in sync. Visitors watch their intent (something like "show me black running shoes in size 10") happen on screen as the assistant explains what they're seeing.

Does the voice keep going when the visitor navigates to another page?

On modern framework sites installed via the SDK (Next.js, Astro+ClientRouter, etc.), yes. The React tree is persistent, so the voice connection survives the page change with no reconnect or audio drop. On classic sites installed via the script tag, the voice session stays open on the agent side, and the freshly loaded page auto-resumes the call from sessionStorage in roughly one second.
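
The script-tag resume path can be pictured as "save a small marker before the page unloads, read it back on the next page." Here is a minimal sketch of that pattern, assuming a JSON marker in sessionStorage. The storage key, the `CallState` fields, and the 30-second freshness window are hypothetical; the source says only that the call resumes from sessionStorage.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical marker describing the call in progress.
interface CallState {
  sessionId: string;
  savedAt: number; // epoch milliseconds
}

const STORAGE_KEY = "spelo:active-call"; // hypothetical key name
const MAX_AGE_MS = 30000;                // assumed freshness window

// Called while a call is active, for example just before navigation.
function saveCallState(store: KeyValueStore, sessionId: string, now: number): void {
  const state: CallState = { sessionId, savedAt: now };
  store.setItem(STORAGE_KEY, JSON.stringify(state));
}

// Called when the next page loads. Returns the call to resume, or null
// when there is no marker, it is malformed, or it is too old.
function loadResumableCall(store: KeyValueStore, now: number): CallState | null {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return null;
  try {
    const parsed = JSON.parse(raw) as Partial<CallState>;
    if (typeof parsed.sessionId !== "string" || typeof parsed.savedAt !== "number") {
      store.removeItem(STORAGE_KEY);
      return null;
    }
    if (now - parsed.savedAt > MAX_AGE_MS) {
      store.removeItem(STORAGE_KEY); // stale marker: start fresh instead
      return null;
    }
    return { sessionId: parsed.sessionId, savedAt: parsed.savedAt };
  } catch {
    store.removeItem(STORAGE_KEY); // unreadable marker
    return null;
  }
}
```

In a browser, `window.sessionStorage` satisfies the `KeyValueStore` interface, and because sessionStorage is scoped to the tab, a marker saved on one page is visible only to the next page loaded in that same tab.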

How quickly does the assistant respond?

Roughly 300 milliseconds end-of-speech to first audio response, depending on network conditions and which engine you picked. Fast enough to interrupt naturally and clarify without feeling laggy. Both supported engines (OpenAI Realtime and Google Gemini Live) hit similar latency in practice; Spelo tunes server-side voice activity detection to a 200ms silence trigger on either path.
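
A silence trigger like the 200 ms one mentioned above can be illustrated with a simplified end-of-turn detector. This sketch assumes each audio frame has already been classified as speech or silence upstream; the `AudioFrame` shape and the function name are hypothetical, and the real detection runs server-side inside the voice engine rather than in code like this.

```typescript
// Hypothetical frame record: a timestamp plus an upstream speech/silence decision.
interface AudioFrame {
  timestampMs: number;
  isSpeech: boolean;
}

const SILENCE_TRIGGER_MS = 200; // the silence trigger quoted in the answer above

// Returns the timestamp at which the visitor's turn is considered finished,
// or null if the silence after speech never reaches the trigger.
function detectEndOfTurn(
  frames: AudioFrame[],
  silenceTriggerMs: number = SILENCE_TRIGGER_MS,
): number | null {
  let heardSpeech = false;
  let silenceStartMs: number | null = null;

  for (const frame of frames) {
    if (frame.isSpeech) {
      heardSpeech = true;
      silenceStartMs = null; // any speech resets the silence timer
      continue;
    }
    if (!heardSpeech) continue; // silence before the first word is not a turn end
    if (silenceStartMs === null) silenceStartMs = frame.timestampMs;
    if (frame.timestampMs - silenceStartMs >= silenceTriggerMs) {
      return frame.timestampMs;
    }
  }
  return null;
}
```

The reset on every speech frame is what keeps a short mid-sentence pause from ending the turn: only an uninterrupted run of silence of at least the trigger length counts.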

Can the assistant fill out forms?

Yes. Spelo identifies form fields by label, placeholder, name, or adjacent <label> and fills them based on what the visitor says. Date and time inputs auto-coerce natural language ("April 28th at 3 PM") into the formats those input types accept. After filling, the assistant clicks submit.

How does Spelo know what is on my pages?

On every conversation request, Spelo scrapes the rendered DOM of the page the visitor is on and sends a structured summary (headings, sections, products, navigation links, CTAs) to the model. No re-indexing jobs, no vector store sync. Update content at 9:00 a.m. and the assistant knows by 9:00:01.

Will it slow down my site?

The bootstrap script is roughly 40 KB minified. The voice library lazy-loads (about 500 KB) only when the visitor clicks the pill. First paint stays untouched. Lighthouse scores are unaffected when the visitor doesn't engage.

What about database or inventory lookup?

Live database lookup (Shopify products, Airtable, your CRM) is set up by the Spelo team as part of the Done-For-You add-on. Self-serve database connection has been disabled to ensure quality. We hand-tune the schema, write the queries, and verify accuracy on real questions before going live.

Stop letting visitors leave without a word.

Every visitor is a potential lead. Spelo turns silent visits into conversations and every conversation into a contact in your CRM. One script tag, one minute to install.

Get started free

Read the docs