Why are rate limits implemented?

Rate limits are a standard practice in APIs, serving various purposes:

Protection Against Abuse or Misuse: Rate limits safeguard the API against malicious attempts to overwhelm it with excessive requests and disrupt service. Maplerad uses rate limits to thwart such activity.

Ensuring Fair Access: To prevent a single user or organization from monopolizing resources, rate limits restrict the number of requests they can make. This ensures a fair opportunity for everyone to use the API without encountering slowdowns caused by excessive requests.

Managing Infrastructure Load: By setting rate limits, Maplerad can control the overall load on its infrastructure. This is crucial to prevent server overload and maintain consistent performance, especially during periods of increased API usage.

How do these rate limits function?
Rate limits are quantified in Requests Per Second (RPS) and are enforced based on whichever condition is met first.
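
As an illustration of staying under a requests-per-second limit from the client side, the sketch below spaces out calls so that no more than a chosen number start per second. The `RequestPacer` class and its `max_rps` parameter are hypothetical names, and the value 5 in the usage note is a placeholder, not Maplerad's actual limit.

```python
import time


class RequestPacer:
    """Spaces out calls so that at most `max_rps` requests start per second."""

    def __init__(self, max_rps, clock=time.monotonic, sleep=time.sleep):
        # Minimum gap between two consecutive requests, in seconds.
        self.min_interval = 1.0 / max_rps
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self):
        """Block until the next request may be sent; return the delay applied."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the next slot one interval after this request's start time.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

A caller would create one pacer, for example `pacer = RequestPacer(max_rps=5)`, and call `pacer.wait()` immediately before each API request. This version is not thread-safe; a multi-threaded client would need a lock around `wait()`.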

Rate Limits in Headers

Information about rate limits, such as remaining requests and other metadata, is available in the HTTP response headers. Key header fields include:

Ratelimit-limit-requests: The maximum number of requests permitted before the rate limit is reached.
Ratelimit-remaining-requests: The number of requests remaining before the rate limit is reached.
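
The two headers above can be read from any response's header mapping. The following is a minimal sketch; the function name `parse_rate_limit_headers` is hypothetical, and it assumes the header values are plain integers, which this page does not state explicitly.

```python
def parse_rate_limit_headers(headers):
    """Return (limit, remaining) as ints; None for a missing or invalid header."""
    # HTTP header names are case-insensitive, so normalise the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name):
        # Assumption: the value is a plain integer string such as "100".
        try:
            return int(lowered[name])
        except (KeyError, TypeError, ValueError):
            return None

    return (
        as_int("ratelimit-limit-requests"),
        as_int("ratelimit-remaining-requests"),
    )
```

With most HTTP clients the function can be given the response's headers directly, for example `limit, remaining = parse_rate_limit_headers(response.headers)`. A client could then slow down or pause when `remaining` approaches zero.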

Error Mitigation

To mitigate potential issues:

Exercise caution with programmatic access, bulk processing, and similar features, enabling them only for trusted customers.

Consider enforcing a hard cap or implementing a manual review process for users who exceed the limit.

Retrying with Exponential Backoff

To address rate limit errors:

Consider automatic retries with a randomized exponential backoff strategy.
Retry unsuccessful requests after a brief sleep, gradually increasing the delay with each subsequent attempt.
Benefits include automatic recovery from rate limit errors without crashes or data loss, quick initial retries, and random delays to prevent simultaneous retries.
Note that continuously resending unsuccessful requests counts against your rate limit, so retrying without a delay is not a viable solution.
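
The retry strategy described above can be sketched as follows. This is one possible approach, not an official client: `RateLimitError` is a hypothetical exception that your own request function would raise when it detects a rate limit error, and the default values for `max_retries`, `base_delay`, and `max_delay` are illustrative assumptions.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised by the caller's request function on a rate limit error."""


def retry_with_backoff(send, max_retries=5, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call send() and retry on RateLimitError with randomized exponential delays."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up instead of resending forever
            # Exponential ceiling: base_delay, then 2x, 4x, ... capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Random delay in [0, ceiling] so that clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

Here `send` is any zero-argument callable that performs one request, so a call might look like `retry_with_backoff(lambda: fetch_balance(client))`, where `fetch_balance` stands in for your own function. Capping the number of retries keeps a persistent failure from consuming the rate limit indefinitely.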