Voice Cloning

Clone a voice from a short audio sample with high fidelity to the original speaker.

Overview

Voice Cloning lets you create a digital replica of any voice using as little as 30 seconds of reference audio. Cloned voices can be used across all Nur products including Text-to-Speech, Voice Agents, and AI Dubbing.

30-second minimum sample

Multi-language cloning

Emotion transfer

High fidelity

Custom fine-tuning

Commercial licensing

Quickstart

Clone a voice from an audio file and use it to generate speech immediately.

from nur import NurClient

client = NurClient()

# Clone a voice from a local file
voice = client.voices.clone(
    name="My Custom Voice",
    file="sample.wav",
    description="Professional narrator voice",
    language="en"
)
print(f"Voice ID: {voice.id}")
print(f"Similarity: {voice.similarity_score}")

# Use the cloned voice for TTS
audio = client.tts.generate(
    text="This is my cloned voice speaking!",
    voice_id=voice.id
)
audio.save("cloned_output.mp3")

Clone Voice

POST /v1/voices/clone

Create a cloned voice from an uploaded audio file or a URL. The audio should contain clear speech with minimal background noise. Supported formats are WAV, MP3, FLAC, and OGG. Provide either file or audio_url, but not both.

Parameter	Type	Description
name (required)	string	Display name for the cloned voice
file	file	Audio file upload (WAV, MP3, FLAC, OGG). Minimum 30 seconds
audio_url	string	Public URL to an audio file for cloning
description	string	Human-readable description of the voice
language	string	Primary language of the voice (e.g. en, es, fr)

# Clone from a local file
curl -X POST https://api.nur.ai/v1/voices/clone \
  -H "Authorization: Bearer $NUR_API_KEY" \
  -F "name=My Custom Voice" \
  -F "file=@sample.wav" \
  -F "description=Professional narrator voice" \
  -F "language=en"

# Clone from a URL
curl -X POST https://api.nur.ai/v1/voices/clone \
  -H "Authorization: Bearer $NUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My Custom Voice",
    "audio_url": "https://example.com/sample.wav",
    "description": "Professional narrator voice",
    "language": "en"
  }'
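
The "file or audio_url, but not both" rule and the list of supported formats can be checked on the client before any request is sent. The following is a minimal sketch using only the Python standard library. The helper name validate_clone_inputs is hypothetical and not part of the Nur SDK, and the extension check is only a convenience: the server's own validation may differ.

```python
from pathlib import Path
from typing import Optional

# Formats listed in the endpoint description.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg"}


def validate_clone_inputs(name: str,
                          file: Optional[str] = None,
                          audio_url: Optional[str] = None) -> str:
    """Return "file" or "audio_url" to indicate which source would be sent.

    Hypothetical helper: it mirrors the documented rules, not the server's checks.
    """
    if not name:
        raise ValueError("name is required")
    # Exactly one of the two sources must be provided.
    if (file is None) == (audio_url is None):
        raise ValueError("provide exactly one of file or audio_url")
    if file is not None:
        ext = Path(file).suffix.lower()
        if ext not in SUPPORTED_EXTENSIONS:
            raise ValueError(f"unsupported audio format: {ext or 'none'}")
        return "file"
    return "audio_url"
```

Calling validate_clone_inputs("My Custom Voice", file="sample.wav") returns "file", while passing both sources, or neither, raises ValueError.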

Response

{
  "id": "voice_klm456",
  "name": "My Custom Voice",
  "description": "Professional narrator voice",
  "language": "en",
  "type": "cloned",
  "preview_url": "https://cdn.nur.ai/voices/voice_klm456/preview.mp3",
  "created_at": "2025-01-15T10:30:00Z",
  "similarity_score": 0.94
}

List Voices

GET /v1/voices

Retrieve a list of all available voices, including both built-in and cloned voices. Filter by language or type to narrow the results.

Parameter	Type	Description
language	string	Filter by language code (e.g. en, es, fr)
type	string	Filter by voice type: built-in or cloned
limit	integer	Number of voices to return. Defaults to 20, max 100

curl -X GET "https://api.nur.ai/v1/voices?language=en&type=cloned&limit=10" \
  -H "Authorization: Bearer $NUR_API_KEY"
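
For callers not using curl, the same query string can be assembled programmatically. This is a sketch under a few assumptions: the function name build_list_voices_url is hypothetical, and the lower bound of 1 for limit is a guess, since only the default (20) and the maximum (100) are documented.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.nur.ai/v1/voices"  # endpoint from the curl example


def build_list_voices_url(language: Optional[str] = None,
                          voice_type: Optional[str] = None,
                          limit: int = 20) -> str:
    """Build the List Voices URL, rejecting values outside the documented set."""
    if voice_type is not None and voice_type not in ("built-in", "cloned"):
        raise ValueError("type must be 'built-in' or 'cloned'")
    # Assumption: the minimum is 1; only the maximum of 100 is documented.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {}
    if language is not None:
        params["language"] = language
    if voice_type is not None:
        params["type"] = voice_type
    params["limit"] = limit
    return f"{BASE_URL}?{urlencode(params)}"


url = build_list_voices_url(language="en", voice_type="cloned", limit=10)
print(url)  # https://api.nur.ai/v1/voices?language=en&type=cloned&limit=10
```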

Get Voice

GET /v1/voices/{voice_id}

Retrieve detailed information about a specific voice, including its preview URL and similarity score for cloned voices.

Parameter	Type	Description
voice_id (required)	string	ID of the voice to retrieve

curl -X GET https://api.nur.ai/v1/voices/voice_klm456 \
  -H "Authorization: Bearer $NUR_API_KEY"

Delete Voice

DELETE /v1/voices/{voice_id}

Permanently delete a cloned voice. Built-in voices cannot be deleted. Any agents using this voice will fall back to the default voice.

Parameter	Type	Description
voice_id (required)	string	ID of the cloned voice to delete

curl -X DELETE https://api.nur.ai/v1/voices/voice_klm456 \
  -H "Authorization: Bearer $NUR_API_KEY"
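
Because deletion is permanent and built-in voices cannot be deleted, it can be useful to check the voice's type before sending the request. The sketch below builds, but does not send, the DELETE request with Python's standard urllib. The helper build_delete_request is hypothetical, and it assumes the caller already holds a voice object (for example from Get Voice) as a dict.

```python
import urllib.request
from urllib.parse import quote

API_ROOT = "https://api.nur.ai/v1"  # base URL taken from the curl examples


def build_delete_request(voice: dict, api_key: str) -> urllib.request.Request:
    """Prepare a DELETE request for a cloned voice without sending it."""
    # Built-in voices cannot be deleted, so refuse them up front.
    if voice.get("type") != "cloned":
        raise ValueError("only cloned voices can be deleted")
    url = f"{API_ROOT}/voices/{quote(voice['id'], safe='')}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},
    )


req = build_delete_request({"id": "voice_klm456", "type": "cloned"}, "test-key")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out here so the example has no side effects.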

Response Objects

Reference for the objects returned by Voice Cloning endpoints.

Voice Object

Field	Type	Description
id	string	Unique voice identifier (prefixed with voice_)
name	string	Display name of the voice
description	string	Human-readable description
language	string	Primary language code
type	string	Voice type: built-in or cloned
preview_url	string	URL to a preview audio clip of the voice
created_at	string	ISO 8601 creation timestamp
similarity_score	number	Cloning accuracy score from 0 to 1 (cloned voices only)
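
The fields above map naturally onto a small typed record. The following sketch parses the sample Clone Voice response into a dataclass using only the standard library. The class name Voice and its from_json method are illustrative and not part of the Nur SDK. Treating description as optional is an assumption; similarity_score is optional because it is documented for cloned voices only.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Voice:
    id: str
    name: str
    description: Optional[str]  # assumption: may be absent
    language: str
    type: str
    preview_url: str
    created_at: datetime
    similarity_score: Optional[float] = None  # cloned voices only

    @classmethod
    def from_json(cls, payload: str) -> "Voice":
        data = json.loads(payload)
        # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
        # so rewrite it as an explicit UTC offset first.
        created = datetime.fromisoformat(data["created_at"].replace("Z", "+00:00"))
        return cls(
            id=data["id"],
            name=data["name"],
            description=data.get("description"),
            language=data["language"],
            type=data["type"],
            preview_url=data["preview_url"],
            created_at=created,
            similarity_score=data.get("similarity_score"),
        )


SAMPLE = """{
  "id": "voice_klm456",
  "name": "My Custom Voice",
  "description": "Professional narrator voice",
  "language": "en",
  "type": "cloned",
  "preview_url": "https://cdn.nur.ai/voices/voice_klm456/preview.mp3",
  "created_at": "2025-01-15T10:30:00Z",
  "similarity_score": 0.94
}"""

voice = Voice.from_json(SAMPLE)
print(voice.id, voice.type, voice.similarity_score)
```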

Best Practices

Use high-quality reference audio

Record in a quiet environment with minimal background noise. A clear, consistent sample produces significantly better clones. Aim for 60-90 seconds of natural speech for optimal results.
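
Sample length can be checked locally before uploading. The sketch below uses Python's standard wave module, so it handles uncompressed WAV files only; MP3, FLAC, and OGG would need a third-party decoder. The function names and the three labels are illustrative. Samples longer than 90 seconds are not treated specially, since nothing above says they cause problems.

```python
import wave

MIN_SECONDS = 30.0          # documented minimum sample length
RECOMMENDED_SECONDS = 60.0  # lower end of the 60-90 second recommendation


def wav_duration_seconds(source) -> float:
    """Duration of a WAV file, given a path or a binary file-like object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_sample_length(source) -> str:
    """Classify a WAV sample as too_short, usable, or recommended."""
    seconds = wav_duration_seconds(source)
    if seconds < MIN_SECONDS:
        return "too_short"
    if seconds < RECOMMENDED_SECONDS:
        return "usable"
    return "recommended"
```

For example, check_sample_length("sample.wav") returns "too_short" for a 10-second recording, which would be rejected by the 30-second minimum.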

Match the intended use case

If your cloned voice will be used for narration, use a narration-style sample. The model captures tone, pacing, and emotion from the reference, so the sample should reflect the desired output style.

Verify with the similarity score

After cloning, check the similarity_score in the response. Scores above 0.90 indicate excellent quality. If the score is below 0.85, try re-recording the sample in better audio conditions.
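
These thresholds can be turned into a simple gate in a cloning pipeline. The sketch below is illustrative: the function name rate_clone is hypothetical, and the "borderline" label for scores from 0.85 to 0.90 is an assumption, because the guidance above does not say how to treat that band.

```python
def rate_clone(similarity_score: float) -> str:
    """Map a similarity score (0 to 1) to a suggested action."""
    if not 0.0 <= similarity_score <= 1.0:
        raise ValueError("similarity_score must be between 0 and 1")
    if similarity_score > 0.90:
        return "excellent"
    if similarity_score < 0.85:
        return "re-record"
    # Assumption: the 0.85-0.90 band is not described in the guidance,
    # so it is flagged for manual review rather than accepted or rejected.
    return "borderline"


print(rate_clone(0.94))  # excellent
```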

Respect licensing and consent

Always obtain explicit consent before cloning someone's voice. Use the commercial licensing capability to ensure compliance. Nur enforces voice verification for production deployments.