All about voice sessions
Sessions are the core unit of data: each one bundles audio, a transcript, and metadata, and is stored as an immutable record.
A Voice Session is the live interaction between the participant and the voice agent.
The behavior of the agent during the session is governed by the prompt associated with the Voice Link or Project.
What a session includes
- Audio recording
- Full transcription
- Timing and turn‑level metadata
- Session‑level labels and custom metadata
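As a rough illustration, the four components above could be modeled in an app as a single record. This is a minimal sketch, not the platform's actual schema: every name here (`VoiceSessionRecord`, `Turn`, `audioUrl`, `turnsBySpeaker`, and the sample values) is hypothetical.

```typescript
// Hypothetical shape of a session record; field names are illustrative only.
interface Turn {
  speaker: "participant" | "agent";
  startMs: number; // offset from session start, in milliseconds
  endMs: number;
  text: string;
}

interface VoiceSessionRecord {
  id: string;
  audioUrl: string; // assumed location of the audio recording
  turns: Turn[]; // full transcription with turn-level timing
  labels: string[]; // session-level labels
  metadata: Record<string, string>; // custom key-value pairs
}

// Count how many turns each speaker took in a session.
function turnsBySpeaker(session: VoiceSessionRecord): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const turn of session.turns) {
    counts[turn.speaker] = (counts[turn.speaker] ?? 0) + 1;
  }
  return counts;
}

// Made-up sample data for illustration.
const exampleSession: VoiceSessionRecord = {
  id: "session-1",
  audioUrl: "https://example.com/audio/session-1.wav",
  turns: [
    { speaker: "agent", startMs: 0, endMs: 1800, text: "Hi, thanks for joining." },
    { speaker: "participant", startMs: 2000, endMs: 4500, text: "Happy to help." },
    { speaker: "agent", startMs: 4700, endMs: 6000, text: "Let's get started." },
  ],
  labels: ["onboarding"],
  metadata: { userId: "u_123" },
};
```

Keeping turn-level timing next to the text makes simple analyses, such as the per-speaker count above, possible without touching the audio.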
Outputs
Every Voice Session produces two primary outputs:
- Audio: the raw voice interaction
- Transcript: a structured, time‑aligned transcription
These outputs are designed to be stored, queried, summarized, and post‑processed by downstream systems. Most Lovable workflows consume transcripts rather than audio directly.
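Since downstream systems mostly work from the transcript, a typical first step is to flatten it into plain text for summarization or search. The sketch below assumes a transcript arrives as a list of segments with a speaker, a start offset in milliseconds, and text. The `TranscriptSegment` type and both function names are assumptions for illustration, not a documented format.

```typescript
// Hypothetical segment shape for a time-aligned transcript.
interface TranscriptSegment {
  speaker: string;
  startMs: number; // offset from session start, in milliseconds
  text: string;
}

// Render a millisecond offset as mm:ss.
function formatOffset(ms: number): string {
  const totalSeconds = Math.floor(ms / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return pad(minutes) + ":" + pad(seconds);
}

// Flatten a transcript into one timestamped line per segment,
// ordered by start time. The input array is not modified.
function transcriptToText(segments: TranscriptSegment[]): string {
  return segments
    .slice()
    .sort((a, b) => a.startMs - b.startMs)
    .map((s) => `[${formatOffset(s.startMs)}] ${s.speaker}: ${s.text}`)
    .join("\n");
}
```

The resulting text can then be passed to whatever summarization or storage step the workflow uses.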
Session settings
Each Voice Link or Project can define session‑level configuration, including:
- Prompt: governs the agent's behavior throughout the conversation
- Timeline: expected flow or duration of the session
- Landing title: what participants see before starting
- Landing description: context and instructions for the participant
- Labels: human‑readable tags for organization
- Metadata: custom key‑value pairs passed from your app
These settings let builders tightly control both the participant experience and the data produced.
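To make the settings list concrete, here is one way an app might represent and sanity-check a configuration before using it. This is a sketch under assumptions: the `SessionSettings` field names, the choice of a free-text `timeline`, and the rule that the prompt and landing text must be non-empty are all hypothetical, not documented platform behavior.

```typescript
// Hypothetical settings shape mirroring the list above.
interface SessionSettings {
  prompt: string; // governs the agent's behavior
  timeline?: string; // expected flow or duration, as free text (assumed)
  landingTitle: string; // shown to participants before starting
  landingDescription: string; // context and instructions
  labels: string[]; // human-readable tags
  metadata: Record<string, string>; // custom key-value pairs from your app
}

// Return a list of problems; an empty list means the settings look usable.
// These checks are illustrative assumptions, not platform rules.
function validateSettings(settings: SessionSettings): string[] {
  const problems: string[] = [];
  if (settings.prompt.trim() === "") problems.push("prompt is empty");
  if (settings.landingTitle.trim() === "") problems.push("landing title is empty");
  if (settings.landingDescription.trim() === "") problems.push("landing description is empty");
  return problems;
}

// Made-up sample configuration.
const exampleSettings: SessionSettings = {
  prompt: "Interview the participant about their onboarding experience.",
  timeline: "About 10 minutes: intro, three questions, wrap-up",
  landingTitle: "Share your onboarding feedback",
  landingDescription: "You will talk with a voice agent for about 10 minutes.",
  labels: ["onboarding", "research"],
  metadata: { userId: "u_123", plan: "pro" },
};
```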
Immutability
Sessions are immutable historical records: once recorded, a session does not change, so it can be queried, analyzed, and processed later with consistent results.
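Downstream code can mirror this guarantee by treating fetched session data as read-only. The sketch below is one client-side convention, assuming session data is held as plain objects. It does not describe how the platform enforces immutability, and `deepFreeze` is a hypothetical helper name.

```typescript
// Recursively freeze a plain object or array in place.
// Already-frozen values are skipped, which also guards against cycles.
function freezeInPlace(value: unknown): void {
  if (typeof value !== "object" || value === null || Object.isFrozen(value)) {
    return;
  }
  Object.freeze(value);
  const obj = value as { [key: string]: unknown };
  for (const key of Object.keys(obj)) {
    freezeInPlace(obj[key]);
  }
}

// Freeze a value and return it typed as read-only at the top level.
function deepFreeze<T>(value: T): Readonly<T> {
  freezeInPlace(value);
  return value as Readonly<T>;
}

// Made-up sample data for illustration.
const frozenSession = deepFreeze({
  id: "session-1",
  labels: ["onboarding"],
  metadata: { userId: "u_123" },
});
```

With this in place, the compiler rejects an assignment such as `frozenSession.id = "other"`, and writes to nested values fail at runtime in strict mode.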