Voice and Music

These dashboard endpoints are session-authenticated; they are not API-key endpoints. They use the caller’s active business context and return generated audio directly as `audio/mpeg`. Generated audio responses are never cached by the browser or CDN. Saved voice profiles expose only NearIQ profile IDs, never provider voice identifiers.

Selectable voices

The dashboard voice selector returns built-in narration presets, optional saved voice profiles for the active business, and a sanitized library list when premium voice generation is configured. A representative response:

{
  "providerConfigured": true,
  "providerError": null,
  "voices": [
    {
      "id": "female",
      "label": "Female voice",
      "description": "Clear commercial narration",
      "category": "built_in",
      "previewText": "Welcome to Aster Grove Fitness...",
      "recommended": true
    }
  ],
  "libraryVoices": [
    {
      "id": "voice_library_id",
      "label": "Narrator",
      "description": null,
      "category": "professional",
      "previewUrl": "https://..."
    }
  ],
  "savedVoices": [
    {
      "id": "b8e41b98-8a18-4f98-a9d4-7c4a56a104c2",
      "name": "Owner voice",
      "isDefault": true,
      "lastUsedAt": "2026-05-29T14:00:00.000Z"
    }
  ]
}
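As a rough sketch of how a client might choose a voice from this response, the helper below prefers the saved profile flagged `isDefault` and otherwise falls back to the first recommended built-in preset. The function name `pick_default_voice` and the preference order are assumptions made for illustration; only the response field names come from the example above.

```python
from typing import Optional


def pick_default_voice(voice_list: dict) -> Optional[dict]:
    """Return a narration request fragment for the preferred voice, or None.

    Hypothetical helper: the preference order (saved default first, then a
    recommended built-in preset) is an assumption, not documented behavior.
    """
    # Prefer the saved profile flagged as the default for the active business.
    for saved in voice_list.get("savedVoices", []):
        if saved.get("isDefault"):
            return {"voiceProfileId": saved["id"]}
    # Otherwise fall back to the first recommended built-in preset.
    for voice in voice_list.get("voices", []):
        if voice.get("category") == "built_in" and voice.get("recommended"):
            return {"voice": voice["id"]}
    return None
```

The returned fragment uses the same field names as the narration request, so it can be merged into a request body.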

Generate narration

POST /api/businesses/me/content/voice

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| `text` | string | yes | Narration text, max 4,000 characters. Sensitive tokens and unsafe requests are blocked or redacted before provider calls. |
| `voice` | enum | no | Built-in preset: `female`, `male`, `warm_female`, `bright_female`, `calm_male`, or `energetic_male`. |
| `speechPace` | enum | no | `slow`, `normal`, or `fast`. The generated preview applies the selected pace with a clearly audible slow/fast range, and the response includes the normalized speech rate for downstream composition. |
| `voiceProfileId` | UUID | no | Saved NearIQ voice profile ID for the active business. |
| `providerVoiceId` | string | no | Opaque voice-library ID returned by the voice list endpoint. Cannot be combined with `voiceProfileId`. |
| `previousText` | string | no | Optional copy that precedes `text`, supplied for better continuity. |
| `nextText` | string | no | Optional copy that follows `text`, supplied for better continuity. |
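The constraints in the table can be checked client-side before sending a request. The following is a minimal sketch, not the service's own validation: `build_voice_payload` is a hypothetical helper, and rejecting empty text is an assumption (the table only states that `text` is required and capped at 4,000 characters).

```python
from typing import Optional

BUILT_IN_VOICES = {
    "female", "male", "warm_female", "bright_female", "calm_male", "energetic_male",
}
SPEECH_PACES = {"slow", "normal", "fast"}
MAX_TEXT_CHARS = 4000


def build_voice_payload(
    text: str,
    voice: Optional[str] = None,
    speech_pace: Optional[str] = None,
    voice_profile_id: Optional[str] = None,
    provider_voice_id: Optional[str] = None,
    previous_text: Optional[str] = None,
    next_text: Optional[str] = None,
) -> dict:
    """Validate the documented constraints and return a JSON-ready dict."""
    # Assumption: empty text is treated as invalid because the field is required.
    if not text or len(text) > MAX_TEXT_CHARS:
        raise ValueError("text is required and limited to 4,000 characters")
    if voice is not None and voice not in BUILT_IN_VOICES:
        raise ValueError(f"unknown built-in voice: {voice}")
    if speech_pace is not None and speech_pace not in SPEECH_PACES:
        raise ValueError(f"unknown speech pace: {speech_pace}")
    # The two voice ID fields are documented as mutually exclusive.
    if voice_profile_id is not None and provider_voice_id is not None:
        raise ValueError("voiceProfileId cannot be combined with providerVoiceId")
    optional = {
        "voice": voice,
        "speechPace": speech_pace,
        "voiceProfileId": voice_profile_id,
        "providerVoiceId": provider_voice_id,
        "previousText": previous_text,
        "nextText": next_text,
    }
    payload = {"text": text}
    # Omit optional fields that were not supplied.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```

Serializing the result with `json.dumps` produces a body in the shape the endpoint's request fields describe.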

curl https://app.neariq.io/api/businesses/me/content/voice \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Try a coached strength session this week.",
    "voice": "bright_female",
    "speechPace": "fast"
  }' \
  --output voice.mp3

The response body is the MP3 audio. Metadata headers include X-NearIQ-Audio, X-NearIQ-Voice, X-NearIQ-Speech-Pace, X-NearIQ-Speech-Rate, and provider request/character-count metadata when available.
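A client that wants the metadata can collect the `X-NearIQ-*` headers without assuming anything about their value formats. This sketch keeps every value as a string; `extract_neariq_metadata` is a hypothetical name, and the sample header values in the usage comment are invented for illustration.

```python
def extract_neariq_metadata(headers: dict) -> dict:
    """Collect X-NearIQ-* response headers, keyed by lowercase header name.

    HTTP header names are case-insensitive, so names are normalized to
    lowercase. Values are left as strings because their formats are not
    specified here.
    """
    prefix = "x-neariq-"
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower().startswith(prefix)
    }


# Usage with made-up sample values:
# extract_neariq_metadata({"Content-Type": "audio/mpeg", "X-NearIQ-Voice": "bright_female"})
```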

Generate background music

POST /api/businesses/me/content/music

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| `prompt` | string | no | Optional music prompt, max 2,000 characters. |
| `music` | enum | no | `upbeat`, `calm`, `corporate`, `inspirational`, or `none`. Used to build a safe default prompt when `prompt` is omitted. |
| `durationSeconds` | number | no | 3 to 120 seconds. Defaults to 15. |
| `forceInstrumental` | boolean | no | Defaults to true so video beds do not clash with narration. |
| `seed` | integer | no | Optional deterministic seed hint. |

curl https://app.neariq.io/api/businesses/me/content/music \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "music": "upbeat",
    "durationSeconds": 15,
    "forceInstrumental": true
  }' \
  --output music.mp3
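The music request fields can be validated the same way before a call is made. This is a minimal sketch under the documented limits and defaults; `build_music_payload` is a hypothetical helper, and sending the defaults explicitly rather than omitting them is a choice made here, not a requirement of the endpoint.

```python
from typing import Optional

MUSIC_STYLES = {"upbeat", "calm", "corporate", "inspirational", "none"}
MAX_PROMPT_CHARS = 2000


def build_music_payload(
    prompt: Optional[str] = None,
    music: Optional[str] = None,
    duration_seconds: float = 15,
    force_instrumental: bool = True,
    seed: Optional[int] = None,
) -> dict:
    """Validate the documented limits and return a JSON-ready dict."""
    if prompt is not None and len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt is limited to 2,000 characters")
    if music is not None and music not in MUSIC_STYLES:
        raise ValueError(f"unknown music style: {music}")
    if not 3 <= duration_seconds <= 120:
        raise ValueError("durationSeconds must be between 3 and 120")
    # bool is a subclass of int in Python, so reject it explicitly.
    if seed is not None and (isinstance(seed, bool) or not isinstance(seed, int)):
        raise ValueError("seed must be an integer")
    payload = {
        "durationSeconds": duration_seconds,
        "forceInstrumental": force_instrumental,
    }
    optional = {"prompt": prompt, "music": music, "seed": seed}
    # Omit optional fields that were not supplied.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```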

Access

Voice and music generation require the Growth plan or higher. Organization members need the `manage_voice` permission for narration and the `content_studio_generate` permission for music. Saved voice cloning requires the Agency plan or higher and the `manage_voice` permission.
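The access rules can be read as a small lookup table. The sketch below models them for organization members only. The plan ordering (`free` below `growth` below `agency`) is an assumption; the real tier list may contain other plans, and the names `PLAN_RANK`, `REQUIREMENTS`, and `can_generate` are hypothetical.

```python
# Assumed ordering of plan tiers; the actual tier list may differ.
PLAN_RANK = {"free": 0, "growth": 1, "agency": 2}

# feature -> (minimum plan, required member permission), per the rules above.
REQUIREMENTS = {
    "narration": ("growth", "manage_voice"),
    "music": ("growth", "content_studio_generate"),
    "voice_cloning": ("agency", "manage_voice"),
}


def can_generate(feature: str, plan: str, permissions: set) -> bool:
    """Return True when a member's plan and permissions satisfy the feature."""
    min_plan, permission = REQUIREMENTS[feature]
    # Unknown plans rank below every known tier and are therefore denied.
    plan_ok = PLAN_RANK.get(plan, -1) >= PLAN_RANK[min_plan]
    return plan_ok and permission in permissions
```

A check like this is only useful for hiding or disabling controls in a client; the server remains the authority on access.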

AI Chat voice endpoints

AI Chat also uses session-authenticated voice endpoints:

POST /api/voice/transcribe accepts browser-recorded audio (webm, mp4/m4a, mp3, wav, or ogg) and returns { "text": "..." }.
POST /api/voice/speak accepts { "text": "..." } and returns generated audio for the Listen action. Repeated matching playback returns X-NearIQ-Audio-Cache: hit, and cache hits do not consume another generation. Free-plan, permission-denied, over-limit, and unavailable-provider responses return JSON with fallback: "browser_voice" so the client can fall back to browser speech.
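One way a client might branch on the `/api/voice/speak` response is sketched below. It assumes JSON fallback responses arrive with an `application/json` content type and that anything else is audio; neither detail is stated above, and `interpret_speak_response` is a hypothetical helper.

```python
import json


def interpret_speak_response(content_type: str, body: bytes, headers: dict) -> dict:
    """Decide whether to play returned audio or use browser speech."""
    # Assumption: fallback responses are labeled application/json.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        data = json.loads(body.decode("utf-8"))
        if data.get("fallback") == "browser_voice":
            return {"action": "browser_speech"}
        # Any other JSON body is surfaced to the caller unchanged.
        return {"action": "error", "detail": data}
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    cache_hit = lowered.get("x-neariq-audio-cache") == "hit"
    return {"action": "play_audio", "audio": body, "cache_hit": cache_hit}
```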

The Listen renderer is provider-configurable: teams can keep premium audio, switch to lower-cost generated audio, or force browser-only speech fallback without changing the chat client. Cached audio is scoped to the selected provider so provider changes never replay the wrong generated voice. The chat client records with the best browser-supported MIME type and falls back to browser speech recognition when available. Empty provider transcripts should surface as a visible “No speech detected” message unless the local transcript fallback captured speech.
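To illustrate what scoping cached audio to the selected provider can mean, the sketch below derives a cache key from both the provider name and the text, so the same text under a different provider never matches an existing entry. How the service actually builds its keys is not documented here; `audio_cache_key` and the key layout are purely illustrative.

```python
import hashlib


def audio_cache_key(provider: str, text: str) -> str:
    """Build a cache key that differs whenever the provider or the text changes.

    Illustrative only: the real cache key scheme is not described in this page.
    """
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"{provider}:{digest}"
```

With a key of this shape, switching the configured provider naturally results in cache misses rather than replaying audio generated by the previous provider.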
