11.25.2025

Build faster, smarter realtime agents: instant streaming, lower latency, and smart interruption handling

1. Runtime V0.8

Built for any use case: companions, language tutors, customer support, fitness trainers, games, and more.
  • Lower-latency agents: Core runtime optimizations significantly reduce latency, making live multimodal agents feel snappier even under heavy LLM and TTS loads.
  • Instant streaming responses: Graph start is now asynchronous, so agents begin streaming tokens or audio as soon as a run kicks off, eliminating the awkward silence at the start of each interaction.
  • Smart early stopping: Cancel an agent run mid‑response for barge‑in, safety, or cost control, so the agent stops talking the moment the user or a policy requires it (see the sketch below). [Link to cancel execution]
For new users: Get Started
For returning users: Follow the 0.6 → 0.8 migration guide
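
To make the new run lifecycle concrete, the sketch below shows the pattern these two features enable: consume a run's output as an async stream so playback can start immediately, and abort it the moment the user barges in. The names here (startRun, AgentRun, respond) are hypothetical stand-ins rather than the Runtime SDK's actual API; see the cancel-execution docs linked above for the real calls.

```typescript
// Illustrative sketch only: AgentRun and startRun are mock stand-ins for the
// runtime's real run handle, written so the barge-in pattern runs end to end.
interface AgentRun {
  stream(): AsyncIterable<{ text: string }>; // streamed tokens (or audio frames)
  cancel(reason: string): Promise<void>;     // stop generation mid-response
}

// Mock run that emits a few tokens with a delay, standing in for the runtime.
function startRun(prompt: string): AgentRun {
  let cancelled = false;
  return {
    async *stream() {
      for (const text of ["Echo:", ...prompt.split(" ")]) {
        if (cancelled) return;
        await new Promise((r) => setTimeout(r, 100)); // simulate generation time
        yield { text: text + " " };
      }
    },
    async cancel(reason: string) {
      cancelled = true;
      console.log(`\n[run cancelled: ${reason}]`);
    },
  };
}

async function respond(userUtterance: string, bargeIn: AbortSignal) {
  const run = startRun(userUtterance);
  // Streaming starts as soon as the run kicks off: render or play each chunk
  // immediately instead of waiting for the full response.
  for await (const chunk of run.stream()) {
    if (bargeIn.aborted) {
      // The user started speaking (or a policy fired): stop the agent now.
      await run.cancel("barge-in");
      return;
    }
    process.stdout.write(chunk.text);
  }
}

// Usage: abort the response 250 ms in, as if the user interrupted mid-answer.
const controller = new AbortController();
setTimeout(() => controller.abort(), 250);
respond("tell me a long story about space llamas", controller.signal);
```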

2. Template Library

Production‑ready in one click
  • Launch from full example projects: Clone production‑ready multimodal templates directly into your project and go from idea to running agent in minutes.
  • Find the right template fast: Filter by input/output modality, SDK, and use case to jump straight to the patterns that match your product.
Start Building with Templates on Portal
Start Building with Templates on Website
Can’t find the template you’re looking for? Talk to our team to request one

3. Overview Page

API keys, onboarding, and application health in one place
  • Personalized onboarding by SDK: See a tailored “getting started” flow for your chosen SDK, so you only follow the steps that matter to your stack.
  • Everything you need, front and center: Jump straight to your API key, llms.txt, and templates from the home view, cutting setup time from minutes to seconds.
  • Holistic observability at a glance: View key traces and logs so you can catch regressions, debug incidents, and keep AI experiences reliable.
Personalized Onboarding
Observability at a Glance