Rate Limits

Understand request quotas and how to optimize your API usage.

Overview

Rate limits protect the Nur API from abuse and ensure fair usage across all customers. Limits are applied at two levels: per-minute request limits that control the rate of incoming requests, and monthly usage quotas that cap the total volume of resources consumed within a billing cycle.

Your rate limits are determined by your account tier. When you exceed a limit, the API returns a 429 Too Many Requests response with a Retry-After header indicating how long to wait before retrying.

Limits by Tier

Each tier provides increasing capacity for requests, usage, and resources. Upgrade your plan at any time from the Nur Dashboard.

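| Limit | Free | Pro | Business | Enterprise |
|---|---|---|---|---|
| Requests / min | 100 | 1,000 | 5,000 | Custom |
| Characters / month | 10,000 | 500,000 | 2,000,000 | Custom |
| Voice clones | 1 | 10 | 50 | Custom |
| Dubbing minutes / month | 5 | 120 | 600 | Custom |
| Music tracks / month | 10 | 200 | 1,000 | Custom |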

Per-Endpoint Limits

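Some resource-intensive endpoints have additional per-endpoint rate limits that are lower than the global request limit. These apply on top of the tier-based limits, so a request to one of these endpoints counts against both its per-endpoint limit and your tier's global request limit.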

| Endpoint | Free | Pro | Business |
|---|---|---|---|
| /v1/dubbing/create | 10 / min | 50 / min | 200 / min |
| /v1/voices/clone | 5 / min | 20 / min | 100 / min |
| /v1/music/generate | 10 / min | 100 / min | 500 / min |
| /v1/audio/enhance | 20 / min | 100 / min | 500 / min |
| /v1/tts/generate | 50 / min | 500 / min | 2,000 / min |
| /v1/stt/transcribe | 30 / min | 200 / min | 1,000 / min |
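
Because the per-endpoint limits apply on top of the tier limit, the number that matters for a given call is the tighter of the two. The sketch below copies the per-minute figures from the tables above into lookup dictionaries; `effective_limit` is a hypothetical helper for illustration, not part of any Nur SDK, and Enterprise is omitted because its limits are custom.

```python
# Per-minute limits copied from the documentation tables.
# Enterprise limits are custom, so they are not modeled here.
GLOBAL_LIMITS = {"free": 100, "pro": 1_000, "business": 5_000}

ENDPOINT_LIMITS = {
    "/v1/dubbing/create": {"free": 10, "pro": 50, "business": 200},
    "/v1/voices/clone": {"free": 5, "pro": 20, "business": 100},
    "/v1/music/generate": {"free": 10, "pro": 100, "business": 500},
    "/v1/audio/enhance": {"free": 20, "pro": 100, "business": 500},
    "/v1/tts/generate": {"free": 50, "pro": 500, "business": 2_000},
    "/v1/stt/transcribe": {"free": 30, "pro": 200, "business": 1_000},
}


def effective_limit(endpoint: str, tier: str) -> int:
    """Return the tighter of the global and per-endpoint per-minute limits.

    Hypothetical helper: endpoints without a listed per-endpoint limit
    fall back to the tier's global request limit.
    """
    global_limit = GLOBAL_LIMITS[tier]
    endpoint_limit = ENDPOINT_LIMITS.get(endpoint, {}).get(tier, global_limit)
    return min(global_limit, endpoint_limit)


print(effective_limit("/v1/tts/generate", "pro"))
```

A client can use this value to size its own throttle for each endpoint rather than relying on the global figure alone.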

Rate Limit Headers

Every API response includes rate limit headers so you can monitor your current usage and remaining capacity in real time.

HTTP/1.1 200 OK
X-RateLimit-Limit: 1000          # Maximum requests allowed per minute
X-RateLimit-Remaining: 742       # Requests remaining in the current window
X-RateLimit-Reset: 1706108460    # Unix timestamp when the window resets

| Header | Type | Description |
|---|---|---|
| X-RateLimit-Limit | Integer | The maximum number of requests allowed in the current time window. |
| X-RateLimit-Remaining | Integer | The number of requests remaining before the limit is reached. |
| X-RateLimit-Reset | Unix timestamp | The time at which the current rate limit window resets. |
| Retry-After | Integer (seconds) | Seconds to wait before retrying. Only present on 429 responses. |
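
As a minimal sketch of reading these headers, the snippet below converts them into typed fields. `RateLimitStatus` and `parse_rate_limit_headers` are illustrative names, not part of the API. It accepts any mapping of header names to strings; a plain `dict` is case-sensitive, whereas the headers object returned by the `requests` library is case-insensitive.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int                         # X-RateLimit-Limit
    remaining: int                     # X-RateLimit-Remaining
    reset_at: int                      # X-RateLimit-Reset (Unix timestamp)
    retry_after: Optional[int] = None  # Retry-After, only on 429 responses


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    """Convert the documented rate limit headers into typed fields."""
    retry_after = headers.get("Retry-After")
    return RateLimitStatus(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=int(headers["X-RateLimit-Reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )


# Values taken from the example response above.
status = parse_rate_limit_headers({
    "X-RateLimit-Limit": "1000",
    "X-RateLimit-Remaining": "742",
    "X-RateLimit-Reset": "1706108460",
})
print(status.remaining, status.reset_at)
```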

Handling 429 Responses

When you receive a 429 status code, read the Retry-After header to determine how long to wait. If the header is not present, use exponential backoff: wait 1 second before the first retry and double the delay after each subsequent attempt.

import time

import requests


def make_request_with_backoff(url, headers, payload, max_retries=5):
    """Make an API request with automatic retry on rate limits."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=payload)

        if response.status_code == 429:
            # Prefer the server's Retry-After value (in seconds); if it is
            # missing, fall back to exponential backoff: 1s, 2s, 4s, ...
            retry_after = int(response.headers.get("Retry-After", 2 ** attempt))
            remaining = response.headers.get("X-RateLimit-Remaining", "unknown")
            print(f"Rate limited. Remaining: {remaining}. Waiting {retry_after}s...")
            time.sleep(retry_after)
            continue

        # Raise for any other 4xx/5xx error; otherwise return the parsed body.
        response.raise_for_status()
        return response.json()

    raise RuntimeError("Max retries exceeded due to rate limiting")


# Usage
result = make_request_with_backoff(
    url="https://api.nur.ai/v1/tts/generate",
    headers={"Authorization": "Bearer nur_your_api_key"},
    payload={"text": "Hello world", "voice_id": "rachel_v2"},
)
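
One possible refinement, sketched here under assumptions rather than taken from the Nur documentation, is to cap the backoff delay and add random jitter so that many clients throttled at the same moment do not all retry in lockstep. `compute_wait` is a hypothetical helper; the 60-second cap and the jitter range are illustrative choices, not documented values.

```python
import random
from typing import Optional


def compute_wait(attempt: int, retry_after: Optional[str],
                 base: float = 1.0, cap: float = 60.0,
                 rng: Optional[random.Random] = None) -> float:
    """Return how many seconds to wait before retry number `attempt` (0-based).

    Assumptions (not from the API docs): the 60 s cap and the jitter range.
    """
    if retry_after is not None:
        try:
            # The server's instruction wins when it is a valid number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Unparseable value: fall through to exponential backoff.

    source = rng if rng is not None else random
    ceiling = min(cap, base * 2 ** attempt)
    # Pick a delay in the upper half of the backoff window.
    return source.uniform(ceiling / 2, ceiling)


print(compute_wait(0, "7"))
```

In the retry loop above, the return value would replace the fixed `retry_after` passed to `time.sleep`.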

Best Practices

Follow these recommendations to make the most of your rate limits and avoid unnecessary throttling.

Batch requests when possible

Some endpoints support batch operations. For example, use the batch transcription endpoint to process multiple audio files in a single request instead of making individual calls.

Implement client-side caching

Cache voice lists, transcription results, and other responses that do not change frequently. This reduces redundant API calls and keeps you well within your limits.
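
A minimal sketch of such a cache, assuming a fixed time-to-live per entry is acceptable for your data. `TTLCache` is an illustrative class, and the fetch callable stands in for whatever API call you would otherwise repeat; no specific Nur endpoint is assumed.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache that expires entries after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, calling `fetch` only when stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = fetch()  # Only this path spends a request against the limit.
        self._entries[key] = (now, value)
        return value


def fetch_voices() -> list:
    # Placeholder for a real API call (hypothetical).
    return ["rachel_v2"]


cache = TTLCache(ttl_seconds=300)
voices = cache.get_or_fetch("voices", fetch_voices)
voices_again = cache.get_or_fetch("voices", fetch_voices)  # served from cache
```

The right time-to-live depends on how often the underlying data changes; five minutes here is only an example.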

Use webhooks instead of polling

For long-running operations like dubbing or voice cloning, register a webhook to receive a notification when the job completes instead of polling the status endpoint repeatedly.

Monitor rate limit headers

Read the X-RateLimit-Remaining header on every response and throttle your client when the value approaches zero. This is more efficient than reacting to 429 errors.
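
A sketch of this kind of proactive throttling: once the remaining count drops to a threshold, pause until the window resets. `seconds_to_pause` is a hypothetical helper, and the threshold of 5 is an arbitrary example that you would tune to your own concurrency.

```python
import time
from typing import Optional


def seconds_to_pause(remaining: int, reset_at: int,
                     threshold: int = 5,
                     now: Optional[float] = None) -> float:
    """Return how long to sleep before the next request.

    `remaining` and `reset_at` come from X-RateLimit-Remaining and
    X-RateLimit-Reset. The threshold is an assumed safety margin.
    """
    if remaining > threshold:
        return 0.0  # Plenty of capacity left: no need to wait.
    current = time.time() if now is None else now
    # Wait until the window resets, never a negative duration.
    return max(0.0, reset_at - current)


# With 3 requests left and 60 seconds until reset, wait out the window.
print(seconds_to_pause(remaining=3, reset_at=1706108460, now=1706108400))
```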

Spread requests evenly

Avoid burst patterns where you send many requests at once. Distribute your requests evenly across the rate limit window for smoother throughput.
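
One way to spread requests is to enforce a minimum gap of 60 / limit seconds between calls. The sketch below assumes a single-threaded client and a one-minute window; `RequestPacer` is an illustrative class, not part of any Nur SDK.

```python
import time
from typing import Callable


class RequestPacer:
    """Space calls at least 60 / requests_per_minute seconds apart."""

    def __init__(self, requests_per_minute: int,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self.interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_slot = clock()  # The first call may proceed immediately.

    def wait(self) -> None:
        """Block until the next evenly spaced slot, then reserve the one after."""
        now = self._clock()
        if now < self._next_slot:
            self._sleep(self._next_slot - now)
            now = self._next_slot
        # Idle time is not banked, so a quiet period cannot cause a burst.
        self._next_slot = now + self.interval


pacer = RequestPacer(requests_per_minute=100)  # Free tier global limit
print(pacer.interval)
```

Calling `pacer.wait()` before each request keeps a Free tier client at roughly one request every 0.6 seconds.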

Request a limit increase

If your use case requires higher limits, contact the Nur team or upgrade to a higher tier. Enterprise plans offer fully custom limits tailored to your needs.