Rate limits

The gateway enforces rate limits at two levels: per-app global limits and per-session/per-room command cooldowns.

Per-app global rate limits

Every API request counts against a per-app request quota. The rate limit uses a fixed-window counter scoped to app ID and client IP address.

Type	Default limit	Window
Authenticated requests	10,000 requests per second	1 second
Unauthenticated requests	10 requests per second	1 second
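
To make the fixed-window behaviour concrete, here is a minimal Python sketch of such a counter. It is not the gateway's implementation: the class name, the choice to key the counter on the (app ID, client IP) pair, and the injectable clock are all assumptions made for illustration.

```python
import time


class FixedWindowLimiter:
    """Illustrative fixed-window counter (hypothetical, not the gateway's code)."""

    def __init__(self, limit, window_seconds=1.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # Assumption: one counter per (app_id, client_ip) pair.
        # Maps key -> [window_index, request_count].
        self._counters = {}

    def allow(self, app_id, client_ip):
        """Return True if the request fits in the current window."""
        key = (app_id, client_ip)
        window = int(self.clock() // self.window_seconds)
        entry = self._counters.get(key)
        if entry is None or entry[0] != window:
            # A new window has started: the count resets to zero.
            entry = [window, 0]
            self._counters[key] = entry
        if entry[1] >= self.limit:
            return False  # the gateway would answer 429 here
        entry[1] += 1
        return True
```

A general property of fixed-window counters is that the count resets at each window boundary, so a burst that straddles a boundary can briefly exceed the nominal per-window limit.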

You can customize the rate limit for your app via the admin API:

curl -X POST https://your-gateway-host/admin/apps/my-app \
  -H "X-Admin-API-Key: your-admin-key" \
  -H "Content-Type: application/json" \
  -d '{"rate_limit": 200}'

The /health endpoint is exempt from rate limiting.

Command rate groups

Individual commands have cooldowns, scoped per session or per room depending on the group, to prevent overwhelming the gateway. Commands in the same rate group share a cooldown window: after sending playback/pause, you must wait for the cooldown to elapse before sending playback/stop, because both are in the seek group.

Seek group

Lightweight playback control operations. 100 ms cooldown, per-session.

playback/stop, playback/pause, playback/resume, playback/seek, playback/restart, record/stop.

Playback group

Commands that fetch files and start playback. 500 ms cooldown, per-session.

playback/start, playback/silence.

Heavy group

Recording and composite commands that involve multi-step operations. 2 second cooldown, per-session.

record/start, play_and_get_digits.

Conference group

Conference join and leave operations. 2 second cooldown, per-session.

conference/join, conference/leave.

Conference mute group

Conference mute and unmute operations. 2 second cooldown, per-session.

conference/mute, conference/unmute.

Conference play group

Starting audio playback into a conference room. 2 second cooldown, per-room.

POST /conferences/{room_id}/play.

Conference control group

Conference playback controls that share a cooldown window. 200 ms cooldown, per-room.

POST /conferences/{room_id}/pause, POST /conferences/{room_id}/volume, POST /conferences/{room_id}/stop.
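
A client can avoid most command-level 429s by tracking these cooldowns locally. The following Python sketch encodes the groups above and reports how long to wait before sending a command. The group keys (such as "conference_mute"), the CooldownTracker class, and the assumption that the cooldown starts when the client sends the command are all illustrative, not part of the gateway API.

```python
import time

# Cooldown in seconds and member commands, copied from the groups above.
# The dictionary keys are illustrative labels, not gateway identifiers.
RATE_GROUPS = {
    "seek": (0.1, ["playback/stop", "playback/pause", "playback/resume",
                   "playback/seek", "playback/restart", "record/stop"]),
    "playback": (0.5, ["playback/start", "playback/silence"]),
    "heavy": (2.0, ["record/start", "play_and_get_digits"]),
    "conference": (2.0, ["conference/join", "conference/leave"]),
    "conference_mute": (2.0, ["conference/mute", "conference/unmute"]),
    "conference_play": (2.0, ["POST /conferences/{room_id}/play"]),
    "conference_control": (0.2, ["POST /conferences/{room_id}/pause",
                                 "POST /conferences/{room_id}/volume",
                                 "POST /conferences/{room_id}/stop"]),
}

COMMAND_TO_GROUP = {
    cmd: group for group, (_, cmds) in RATE_GROUPS.items() for cmd in cmds
}


class CooldownTracker:
    """Client-side bookkeeping of the last send time per (scope, group)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self._last_sent = {}  # (scope_id, group) -> timestamp

    def wait_time(self, scope_id, command):
        """Seconds to wait before sending; scope_id is a session or room ID."""
        group = COMMAND_TO_GROUP.get(command)
        if group is None:
            return 0.0  # not in any group listed here: no local cooldown
        last = self._last_sent.get((scope_id, group))
        if last is None:
            return 0.0
        cooldown = RATE_GROUPS[group][0]
        return max(0.0, cooldown - (self.clock() - last))

    def mark_sent(self, scope_id, command):
        """Record that a command was just sent for this session or room."""
        group = COMMAND_TO_GROUP.get(command)
        if group is not None:
            self._last_sent[(scope_id, group)] = self.clock()
```

Call wait_time before each command, sleep for the returned duration, then call mark_sent once the request goes out. Because the gateway measures the cooldown on its side, network jitter can still produce an occasional 429, so this complements retry handling rather than replacing it.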

429 response handling

When a rate limit is exceeded, the gateway returns a 429 Too Many Requests response:

{
  "success": false,
  "error": "Rate limit exceeded"
}

When a command-level cooldown is violated, the 429 response carries a different error message:

{
  "success": false,
  "error": "Command rate limited"
}
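
Since both cases share the same status code, a client that wants to react differently has to inspect the body. The Python sketch below does that by matching the two error strings shown above. The function name and return labels are hypothetical, and it assumes the first body corresponds to the per-app limit and that the error strings are stable enough to match on.

```python
import json


def classify_429(status_code, body_text):
    """Return 'global', 'command', or 'unknown' for a 429; None otherwise."""
    if status_code != 429:
        return None
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return "unknown"  # body was not valid JSON
    error = payload.get("error") if isinstance(payload, dict) else None
    if error == "Rate limit exceeded":
        return "global"   # assumed: per-app request quota
    if error == "Command rate limited":
        return "command"  # command-level cooldown
    return "unknown"
```

A "command" result suggests waiting roughly one cooldown for that rate group, while a "global" result suggests slowing all traffic for the app.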

Best practices

Respect cooldowns. Command cooldowns protect the gateway from overload. Sending commands faster than the cooldown results in 429 errors, not faster execution.
Use webhooks instead of polling. Rather than repeatedly querying state, wait for webhook events that tell you when an operation completes.
Batch where possible. The playback/start command accepts multiple URLs in a single request. Use this instead of sending separate start commands.
Implement exponential backoff. When you receive a 429, wait before retrying. A good starting point is the cooldown duration, then double the wait on subsequent 429s.
Monitor your usage. If you consistently hit global rate limits, increase your per-app rate_limit via the admin API.
Know which commands are exempt. The answer, hangup, and disconnect commands have no command-level cooldown and can be sent at any time.
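
The exponential backoff recommendation above can be sketched in Python as follows. The send callable is a hypothetical stand-in for whatever function issues the request and returns its HTTP status code, and the max_attempts default is an arbitrary choice for illustration.

```python
import time


def send_with_backoff(send, cooldown_seconds, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send: hypothetical zero-argument callable returning an HTTP status code.
    cooldown_seconds: initial wait, e.g. the cooldown of the command's group.
    """
    delay = cooldown_seconds
    status = None
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            sleep(delay)  # wait before the next try
            delay *= 2    # double the wait after each 429
    return status  # still 429 after the final attempt
```

For a seek-group command, for example, send_with_backoff(send, 0.1) would wait 100 ms after the first 429, then 200 ms, then 400 ms. Adding a cap on the delay or some random jitter is a common refinement when many clients retry at once.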