ElevenLabs Speech Engine adds voice capabilities to any chat agent. ElevenLabs handles speech-to-text and text-to-speech while your server provides the LLM logic. The SDK manages connection lifecycle, turn-taking, and interruption detection so you can focus on your agent’s behavior.
Build a voice agent with the ElevenLabs SDK.
Classes, methods, and events for the JavaScript SDK.
Classes, methods, and events for the Python SDK.
Speech Engine connects your server to ElevenLabs over WebSocket. Each connection represents one conversation.
Speech Engine is designed for developers who want to bring their own LLM and control the conversation logic on their own server. Use it when you need to:
If you want a fully hosted solution where ElevenLabs provides the LLM, knowledge base, and tools, use ElevenAgents instead.
AbortSignal (TypeScript) or task cancellation (Python).
Any LLM that produces text. The SDK has built-in stream extraction for OpenAI (Responses API and Chat Completions API), the Anthropic Messages API, and the Google Gemini API. For other providers, pass a plain string or an async iterable of string chunks.
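For a provider without built-in stream extraction, one way to produce an async iterable of string chunks in Python is an async generator that pulls the text out of each streamed event. The sketch below is a minimal illustration under stated assumptions: the event shape (a dict with a "delta" key) and the names fake_provider_stream, text_chunks, and collect are hypothetical and not part of any SDK. How the resulting iterable is handed to Speech Engine is not shown, because this page does not specify that call.

```python
import asyncio
from typing import AsyncIterator, Dict, List

# Hypothetical event shape for an unsupported provider: each streamed event
# is a dict that may carry a text fragment under "delta". This is an
# assumption made for illustration only.
FAKE_EVENTS: List[Dict[str, str]] = [
    {"type": "start"},
    {"type": "text", "delta": "Hello"},
    {"type": "text", "delta": ", "},
    {"type": "text", "delta": "world."},
    {"type": "end"},
]


async def fake_provider_stream(
    events: List[Dict[str, str]],
) -> AsyncIterator[Dict[str, str]]:
    """Stand-in for a provider's streaming client (not a real API)."""
    for event in events:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield event


async def text_chunks(events: List[Dict[str, str]]) -> AsyncIterator[str]:
    """Adapt provider events into an async iterable of plain string chunks."""
    async for event in fake_provider_stream(events):
        delta = event.get("delta")
        if delta:  # skip events that carry no text
            yield delta


async def collect(chunks: AsyncIterator[str]) -> List[str]:
    """Drain an async iterable of strings into a list."""
    return [chunk async for chunk in chunks]


if __name__ == "__main__":
    print(asyncio.run(collect(text_chunks(FAKE_EVENTS))))
```

Yielding chunks as they arrive, rather than joining them into one string first, lets speech synthesis begin before the LLM has finished its full response.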
ElevenAgents is a fully hosted platform where ElevenLabs provides the LLM, knowledge base, and tools. Speech Engine is for developers who want to bring their own LLM and control the conversation logic on their own server.
In TypeScript, you can attach Speech Engine to any Node.js HTTP server (Express, Fastify, or plain http.createServer()), or run a standalone WebSocket server. In Python, the SDK provides a standalone server via engine.serve(), or you can integrate with FastAPI, Starlette, or any ASGI framework using engine.create_session().