Rate limits

The gateway enforces rate limits at two levels: per-app global limits and per-session/per-room command cooldowns.

Every API request counts against a per-app request quota. The rate limit uses a fixed-window counter scoped to app ID and client IP address.

| Type                     | Default limit              | Window   |
|--------------------------|----------------------------|----------|
| Authenticated requests   | 10,000 requests per second | 1 second |
| Unauthenticated requests | 10 requests per second     | 1 second |
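To make the fixed-window behavior concrete, here is a minimal sketch of a counter keyed by app ID and client IP. It is an illustration only, not the gateway's actual implementation: the class name `FixedWindowLimiter`, its parameters, and the injectable `clock` are all assumptions made for the example.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Illustrative fixed-window counter keyed by (app_id, client_ip).

    Hypothetical helper: the gateway's real data structures are not
    documented here.
    """

    def __init__(self, limit, window_seconds=1.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # key -> [window_index, request_count]
        self.counters = defaultdict(lambda: [None, 0])

    def allow(self, app_id, client_ip):
        """Return True if the request fits in the current window."""
        window_index = int(self.clock() // self.window_seconds)
        entry = self.counters[(app_id, client_ip)]
        if entry[0] != window_index:
            # A new window has started, so the count starts over.
            entry[0] = window_index
            entry[1] = 0
        if entry[1] >= self.limit:
            return False  # the gateway would answer 429 here
        entry[1] += 1
        return True
```

A general property of fixed windows is that the count resets at each window boundary, so a client can use its full quota at the end of one window and again at the start of the next.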

You can customize the rate limit for your app via the admin API:

curl -X POST https://your-gateway-host/admin/apps/my-app \
-H "X-Admin-API-Key: your-admin-key" \
-H "Content-Type: application/json" \
-d '{"rate_limit": 200}'

The /health endpoint is exempt from rate limiting.

Individual commands have cooldowns, scoped per session or per room, that keep clients from overwhelming the gateway. Commands in the same rate group share a cooldown window: after sending playback/pause, you must wait for the cooldown to expire before sending playback/stop, because both are in the seek group.

Lightweight playback control operations. 100 ms cooldown, per-session.

playback/stop, playback/pause, playback/resume, playback/seek, playback/restart, record/stop.

Commands that fetch files and start playback. 500 ms cooldown, per-session.

playback/start, playback/silence.

Recording and composite commands that involve multi-step operations. 2 second cooldown, per-session.

record/start, play_and_get_digits.

Conference join and leave operations. 2 second cooldown, per-session.

conference/join, conference/leave.

Conference mute and unmute operations. 2 second cooldown, per-session.

conference/mute, conference/unmute.

Starting audio playback into a conference room. 2 second cooldown, per-room.

POST /conferences/{room_id}/play.

Conference playback controls that share a cooldown window. 200 ms cooldown, per-room.

POST /conferences/{room_id}/pause, POST /conferences/{room_id}/volume, POST /conferences/{room_id}/stop.
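A client can track these cooldowns locally and avoid sending a command that would be rejected. The sketch below is one possible approach, not part of the gateway API. The cooldown values, scopes, and command names come from the groups above. Only the seek group is named in the text, so every other group name is a made-up label, as are `CooldownTracker` and its methods.

```python
import time

# group -> (cooldown in seconds, scope). Only "seek" is a documented name;
# the other group names are hypothetical labels for this example.
GROUPS = {
    "seek": (0.1, "session"),
    "start": (0.5, "session"),
    "record": (2.0, "session"),
    "conference_membership": (2.0, "session"),
    "conference_mute": (2.0, "session"),
    "conference_play": (2.0, "room"),
    "conference_control": (0.2, "room"),
}

COMMAND_GROUP = {
    "playback/stop": "seek", "playback/pause": "seek",
    "playback/resume": "seek", "playback/seek": "seek",
    "playback/restart": "seek", "record/stop": "seek",
    "playback/start": "start", "playback/silence": "start",
    "record/start": "record", "play_and_get_digits": "record",
    "conference/join": "conference_membership",
    "conference/leave": "conference_membership",
    "conference/mute": "conference_mute",
    "conference/unmute": "conference_mute",
    "POST /conferences/{room_id}/play": "conference_play",
    "POST /conferences/{room_id}/pause": "conference_control",
    "POST /conferences/{room_id}/volume": "conference_control",
    "POST /conferences/{room_id}/stop": "conference_control",
}


class CooldownTracker:
    """Client-side record of when each rate group was last used."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.last_sent = {}  # (group, scope_id) -> timestamp

    def wait_time(self, command, scope_id):
        """Seconds to wait before sending; 0.0 if it can go now.

        scope_id is a session ID or a room ID, depending on the group.
        Commands with no group (answer, hangup, disconnect) never wait.
        """
        group = COMMAND_GROUP.get(command)
        if group is None:
            return 0.0
        last = self.last_sent.get((group, scope_id))
        if last is None:
            return 0.0
        cooldown, _scope = GROUPS[group]
        return max(0.0, cooldown - (self.clock() - last))

    def mark_sent(self, command, scope_id):
        """Record that a command was just sent for this session or room."""
        group = COMMAND_GROUP.get(command)
        if group is not None:
            self.last_sent[(group, scope_id)] = self.clock()
```

A caller would check `wait_time(...)`, sleep for that long if it is non-zero, send the command, and then call `mark_sent(...)`. The gateway remains the authority, so a 429 can still occur and should still be handled.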

When a rate limit is exceeded, the gateway returns a 429 Too Many Requests response:

{
"success": false,
"error": "Rate limit exceeded"
}

For command-level cooldowns:

{
"success": false,
"error": "Command rate limited"
}
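Because both cases use status 429, a client may want to tell them apart by the error string. The helper below is a small sketch that assumes the two messages shown above are returned verbatim. The function name `classify_rate_limit` and its return labels are invented for this example.

```python
import json


def classify_rate_limit(status_code, body_text):
    """Return "global", "command", or "unknown" for a 429; None otherwise.

    Assumes the error strings match the sample responses exactly.
    """
    if status_code != 429:
        return None
    try:
        error = json.loads(body_text).get("error", "")
    except (ValueError, TypeError, AttributeError):
        # Body was missing, not JSON, or not a JSON object.
        error = ""
    if error == "Rate limit exceeded":
        return "global"
    if error == "Command rate limited":
        return "command"
    return "unknown"
```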
  • Respect cooldowns. Command cooldowns protect the gateway from overload. Sending commands faster than the cooldown allows results in 429 errors, not faster execution.
  • Use webhooks instead of polling. Rather than repeatedly querying state, wait for webhook events that tell you when an operation completes.
  • Batch where possible. The playback/start command accepts multiple URLs in a single request. Use this instead of sending separate start commands.
  • Implement exponential backoff. When you receive a 429, wait before retrying. Start with the cooldown duration, then double the wait after each subsequent 429.
  • Monitor your usage. If you consistently hit global rate limits, increase your per-app rate_limit via the admin API.
  • Know the exempt commands. The answer, hangup, and disconnect commands have no command-level cooldown and can be sent at any time.
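The exponential backoff advice can be sketched as follows: start from the cooldown duration and double the wait after each 429. This is a minimal example under stated assumptions. `send` is a hypothetical callable that performs one request and returns the HTTP status code, and the 30 second cap and retry count are arbitrary choices rather than gateway requirements.

```python
import time


def backoff_delays(initial, max_retries, cap=30.0):
    """Yield waits that start at `initial` and double each time, up to `cap`."""
    delay = initial
    for _ in range(max_retries):
        yield min(delay, cap)
        delay *= 2


def send_with_backoff(send, initial, max_retries=5, sleep=time.sleep):
    """Call send() until it returns something other than 429.

    `send` is a hypothetical zero-argument callable returning a status code.
    Returns the last status seen, which is still 429 if retries ran out.
    """
    status = send()
    for delay in backoff_delays(initial, max_retries):
        if status != 429:
            break
        sleep(delay)
        status = send()
    return status
```

For a command in the 500 ms group, for example, `send_with_backoff(send, initial=0.5)` would wait 0.5 s, 1 s, 2 s, and so on between attempts.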