Rate Limits

The primary constraint on Egret API usage is your plan's monthly credit quota. Infrastructure-level rate limits are enforced at the platform layer and are not currently exposed via response headers.

Credit quota

Every query consumes credits. When your quota is exhausted, query endpoints return 402 Payment Required until your billing period resets or you upgrade your plan.

Monitor your remaining credits at any time:

GET https://api.getegret.com/v1/billings/subscription/
Authorization: Bearer egret_...

Check current_quota.credits_remaining and current_quota.is_exhausted in the response. See Billing & Credits for the full response shape.
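
The check above can be scripted. The following is a minimal sketch using only the Python standard library. It assumes the endpoint returns a JSON body in which current_quota is a nested object, that credits_remaining is a number, and that is_exhausted is a boolean. The function names fetch_subscription and quota_status are illustrative, not part of any Egret SDK.

```python
import json
import urllib.request

SUBSCRIPTION_URL = "https://api.getegret.com/v1/billings/subscription/"


def fetch_subscription(api_key: str) -> dict:
    """GET the subscription resource and return the decoded JSON body."""
    request = urllib.request.Request(
        SUBSCRIPTION_URL,
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def quota_status(subscription: dict) -> tuple[int, bool]:
    """Return (credits_remaining, is_exhausted) from a subscription payload.

    Assumes the fields live under a nested "current_quota" object.
    """
    quota = subscription["current_quota"]
    return quota["credits_remaining"], quota["is_exhausted"]
```

Calling quota_status(fetch_subscription(your_api_key)) yields both values in one step. Keeping the parsing separate from the network call makes it possible to test against a saved payload.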

Per-plan quotas

Plan          Credits/month  Resets
Starter       5,000          Monthly on billing date
Professional  20,000         Monthly on billing date
Enterprise    Custom         Custom
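
To get a feel for what a monthly quota allows, it can help to turn the remaining credits into a daily budget. The helper below is a rough sketch, not an Egret feature: daily_credit_budget is a hypothetical name, and the caller is assumed to supply period_end as a timezone-aware datetime (how the API serializes that field is not covered here).

```python
from datetime import datetime, timedelta, timezone


def daily_credit_budget(
    credits_remaining: int, period_end: datetime, now: datetime
) -> float:
    """Credits that can be spent per day so the quota lasts until period_end.

    When less than one day remains (or the period has already ended),
    the whole remainder is treated as the budget for that day.
    """
    days_left = (period_end - now).total_seconds() / 86400
    return credits_remaining / max(days_left, 1.0)


# Example: a full Starter quota with 30 days left in the billing period.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
budget = daily_credit_budget(5000, now + timedelta(days=30), now)
```

With a full Starter quota and 30 days left, the budget works out to 5,000 / 30, or roughly 167 credits per day.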

Infrastructure limits

Egret's query pipeline runs on AWS Bedrock, which enforces its own service quotas at the model and knowledge base level. The API abstracts these limits away, so you will not receive a rate-limit error from Bedrock directly. In practice, your plan tier constrains how many queries you can run concurrently.
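
Because the concurrency ceiling is not published as a number, a client that fans out many queries may want to cap its own parallelism. The sketch below shows one way to do that with a bounded thread pool. The run_bounded helper and its default of 4 are hypothetical and are not documented Egret limits. Tune the value against what your plan sustains.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_bounded(
    task: Callable[[T], R], items: Iterable[T], max_concurrency: int = 4
) -> list[R]:
    """Apply task to every item with at most max_concurrency calls in flight.

    Results come back in the same order as items. The default of 4 is an
    arbitrary placeholder, not a documented Egret limit.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(task, items))
```

Here task would be whatever function sends a single query. The pool guarantees that no more than max_concurrency of those calls run at once.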

If you are building a high-throughput integration and need increased limits, contact us to discuss Enterprise options.

Handling quota exhaustion

When you receive a 402 response, either wait for your billing period to reset (see current_quota.period_end) or upgrade your plan from the dashboard. In high-usage applications, we recommend checking credits_remaining proactively so that you can act before the quota runs out.
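
One way to apply this in client code is to treat 402 as a distinct, non-retryable condition. The sketch below uses only the Python standard library and assumes query responses are JSON. The QuotaExhausted exception and the send_query wrapper are illustrative names, and the caller builds the request, so no particular query endpoint is assumed.

```python
import json
import urllib.error
import urllib.request


class QuotaExhausted(Exception):
    """Raised when a query endpoint answers 402 Payment Required."""


def send_query(request, opener=urllib.request.urlopen):
    """Send a prepared request and return the decoded JSON body.

    A 402 is converted into QuotaExhausted so callers can stop sending
    queries instead of retrying; every other HTTP error propagates unchanged.
    """
    try:
        with opener(request, timeout=30) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 402:
            raise QuotaExhausted("Egret credit quota exhausted") from err
        raise
```

Retrying a 402 with backoff does not help, because the condition only clears at the billing reset or after an upgrade. Catching QuotaExhausted at the top of a worker loop is a reasonable place to pause the job and alert an operator.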