Voice and Music
Generate narration previews, commercial voiceover audio, and background music for Content Studio workflows.
These dashboard endpoints are session-authenticated. They use the caller’s
active business context and return generated audio directly as audio/mpeg,
with metadata in response headers. They are not API-key endpoints.
Generated audio responses are never cached by the browser or CDN. Saved voice
profiles continue to expose only NearIQ profile IDs, not provider voice
identifiers.
Selectable voices
The dashboard voice selector returns built-in narration presets, optional saved voice profiles for the active business, and a sanitized library list when premium voice generation is configured.
Generate narration
POST /api/businesses/me/content/voice
| Field | Type | Required | Notes |
|---|---|---|---|
| text | string | yes | Narration text, max 4,000 characters. Sensitive tokens and unsafe requests are blocked or redacted before provider calls. |
| voice | enum | no | Built-in preset: female, male, warm_female, bright_female, calm_male, or energetic_male. |
| speechPace | enum | no | slow, normal, or fast. The generated preview applies the selected pace using a clearly audible slow/fast range, and the response includes the normalized speech rate for downstream composition. |
| voiceProfileId | UUID | no | Saved NearIQ voice profile ID for the active business. |
| providerVoiceId | string | no | Opaque voice-library ID returned by the voice list endpoint. Cannot be combined with voiceProfileId. |
| previousText | string | no | Optional preceding copy for better continuity. |
| nextText | string | no | Optional following copy for better continuity. |
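The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch, not NearIQ's implementation: the type and function names are hypothetical, it assumes the character limit is measured like a JavaScript string length, and it assumes an empty text value is rejected.

```typescript
// Field names and limits follow the table above. All type and function
// names here are illustrative assumptions, not part of NearIQ.
type Voice =
  | "female" | "male" | "warm_female"
  | "bright_female" | "calm_male" | "energetic_male";
type SpeechPace = "slow" | "normal" | "fast";

interface NarrationRequest {
  text: string;
  voice?: Voice;
  speechPace?: SpeechPace;
  voiceProfileId?: string;
  providerVoiceId?: string;
  previousText?: string;
  nextText?: string;
}

// ASSUMPTION: the 4,000-character limit is compared against string length
// (UTF-16 code units); the server may count characters differently.
const MAX_NARRATION_CHARS = 4000;

// Returns a list of problems; an empty list means the body looks sendable.
function validateNarrationRequest(req: NarrationRequest): string[] {
  const problems: string[] = [];
  if (req.text.length === 0) {
    problems.push("text is required");
  }
  if (req.text.length > MAX_NARRATION_CHARS) {
    problems.push(`text exceeds ${MAX_NARRATION_CHARS} characters`);
  }
  if (req.voiceProfileId !== undefined && req.providerVoiceId !== undefined) {
    problems.push("providerVoiceId cannot be combined with voiceProfileId");
  }
  return problems;
}
```

A caller might run this check before posting the body from the signed-in dashboard session. The server remains the authority on validation, including the redaction of sensitive tokens.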
The response body is the MP3 audio. Metadata headers include
X-NearIQ-Audio, X-NearIQ-Voice, X-NearIQ-Speech-Pace,
X-NearIQ-Speech-Rate, and provider request/character-count metadata when
available.
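Reading the narration metadata headers might look like the sketch below. The header names come from the text above; the HeaderReader interface, the result shape, and the assumption that X-NearIQ-Speech-Rate carries a plain number are all hypothetical. The Fetch API's Headers object satisfies HeaderReader structurally, so response.headers could be passed in directly.

```typescript
// Minimal structural view of a headers object (assumption: the caller has
// something with a get() method, such as the Fetch API's Headers).
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical result shape; missing headers are reported as null.
interface NarrationMetadata {
  audio: string | null;
  voice: string | null;
  speechPace: string | null;
  speechRate: number | null;
}

function readNarrationMetadata(headers: HeaderReader): NarrationMetadata {
  const rawRate = headers.get("X-NearIQ-Speech-Rate");
  // ASSUMPTION: the speech rate header is a decimal number. Anything that
  // does not parse as a finite number is reported as null.
  const rate =
    rawRate === null || rawRate.trim() === "" ? NaN : Number(rawRate);
  return {
    audio: headers.get("X-NearIQ-Audio"),
    voice: headers.get("X-NearIQ-Voice"),
    speechPace: headers.get("X-NearIQ-Speech-Pace"),
    speechRate: Number.isFinite(rate) ? rate : null,
  };
}
```

Returning null for absent or unparseable values keeps downstream composition code from treating a missing rate as zero.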
Generate background music
POST /api/businesses/me/content/music
| Field | Type | Required | Notes |
|---|---|---|---|
| prompt | string | no | Optional music prompt, max 2,000 characters. |
| music | enum | no | upbeat, calm, corporate, inspirational, or none. Used to build a safe default prompt when prompt is omitted. |
| durationSeconds | number | no | 3 to 120 seconds. Defaults to 15. |
| forceInstrumental | boolean | no | Defaults to true so video beds do not clash with narration. |
| seed | integer | no | Optional deterministic seed hint. |
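The defaults and ranges in the table can be made explicit on the client. This is a sketch under stated assumptions: buildMusicRequest and the type names are hypothetical, the server applies its own defaults regardless, and the prompt limit is assumed to be measured like a JavaScript string length.

```typescript
// Enum values and limits follow the table above; names are illustrative.
type MusicStyle = "upbeat" | "calm" | "corporate" | "inspirational" | "none";

interface MusicRequest {
  prompt?: string;
  music?: MusicStyle;
  durationSeconds?: number;
  forceInstrumental?: boolean;
  seed?: number;
}

// Validates the documented ranges and fills in the documented defaults
// (15 seconds, instrumental only). Throws on out-of-range input.
function buildMusicRequest(input: MusicRequest): MusicRequest {
  const duration = input.durationSeconds ?? 15;
  if (!Number.isFinite(duration) || duration < 3 || duration > 120) {
    throw new RangeError("durationSeconds must be between 3 and 120");
  }
  // ASSUMPTION: the 2,000-character limit is compared against string length.
  if (input.prompt !== undefined && input.prompt.length > 2000) {
    throw new RangeError("prompt exceeds 2,000 characters");
  }
  if (input.seed !== undefined && !Number.isInteger(input.seed)) {
    throw new TypeError("seed must be an integer");
  }
  return {
    ...input,
    durationSeconds: duration,
    forceInstrumental: input.forceInstrumental ?? true,
  };
}
```

For example, buildMusicRequest({ music: "calm" }) yields a body with durationSeconds set to 15 and forceInstrumental set to true, matching the documented defaults.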
Access
Voice and music generation require the Growth plan or higher. Organization members need manage_voice for narration and content_studio_generate for music. Saved
voice cloning requires the Agency plan or higher and the manage_voice permission.
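The access rules above can be expressed as a small lookup, for instance to hide dashboard controls a user cannot use. This is only a sketch: the plan identifiers, the assumption that Agency ranks above Growth, the feature names, and the canUse function are all hypothetical, and it treats every caller as an organization member. The server-side checks remain authoritative.

```typescript
// ASSUMPTION: plan identifiers and their ordering are hypothetical. The
// text only names Growth and Agency; any other plan is treated as ranking
// below Growth.
const PLAN_RANK: Record<string, number> = { growth: 1, agency: 2 };

type Feature = "narration" | "music" | "voice_cloning";

// Minimum plan and permission per feature, as described in the text.
const REQUIREMENTS: Record<Feature, { minPlan: string; permission: string }> = {
  narration: { minPlan: "growth", permission: "manage_voice" },
  music: { minPlan: "growth", permission: "content_studio_generate" },
  voice_cloning: { minPlan: "agency", permission: "manage_voice" },
};

// True when the plan meets the minimum and the permission is present.
function canUse(feature: Feature, plan: string, permissions: string[]): boolean {
  const rule = REQUIREMENTS[feature];
  const have = PLAN_RANK[plan] ?? 0;
  const need = PLAN_RANK[rule.minPlan] ?? Infinity;
  return have >= need && permissions.includes(rule.permission);
}
```

Under these assumptions, a Growth member holding only manage_voice could generate narration but not music or saved voice clones.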
AI Chat voice endpoints
AI Chat also uses session-authenticated voice endpoints:
- POST /api/voice/transcribe accepts browser-recorded audio (webm, mp4/m4a, mp3, wav, or ogg) and returns { "text": "..." }.
- POST /api/voice/speak accepts { "text": "..." } and returns generated audio for the Listen action.
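A client might pre-check a recording and unwrap the transcription result as sketched below. The function names are hypothetical, and checking the file extension is an assumption: the text lists the accepted formats but does not say how the endpoint identifies them.

```typescript
// Formats listed for POST /api/voice/transcribe.
// ASSUMPTION: matching on file extension is for illustration only.
const ACCEPTED_EXTENSIONS = ["webm", "mp4", "m4a", "mp3", "wav", "ogg"];

// True when the filename ends in one of the listed formats (case-insensitive).
function isAcceptedRecording(filename: string): boolean {
  const dot = filename.lastIndexOf(".");
  if (dot < 0) return false;
  return ACCEPTED_EXTENSIONS.includes(filename.slice(dot + 1).toLowerCase());
}

// Extracts the transcript from a parsed { "text": "..." } response body,
// throwing if the body does not have that shape.
function parseTranscript(body: unknown): string {
  if (typeof body === "object" && body !== null && "text" in body) {
    const text = (body as { text: unknown }).text;
    if (typeof text === "string") return text;
  }
  throw new Error("unexpected transcribe response shape");
}
```

Validating the response shape before use avoids passing undefined into the chat input if the endpoint returns an error body instead.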