Introduction To API Throttling | Openbridge Help Center

"API throttling," or simply "throttling," describes what happens when a service offers limited API capacity and clients collectively exceed it, leaving your own application unable to make requests. When an API is being throttled, the rate of incoming requests has surpassed the limits set by the API provider, so some requests are denied or delayed.

Why do APIs use Throttling?

In a more technical sense, "throttling" also refers to the intentional slowing or limiting of requests to an API to prevent overuse, usually implemented by the API provider. A provider might cap the number of requests a client can make in a given time frame (e.g., 100 requests per hour). Once this limit is hit, any further requests from that client are throttled until the time window resets. A request often counts against the limit whether it succeeds or returns an error.
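To make the time-window idea concrete, here is a minimal sketch of a fixed-window counter, one common way a provider might enforce a cap such as 100 requests per hour. The class name `FixedWindowLimiter` and its parameters are illustrative assumptions, not any provider's actual implementation.

```python
class FixedWindowLimiter:
    """Hypothetical provider-side check: counts requests per client per fixed window."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window_seconds = window_seconds
        self._counts = {}  # (client_id, window_index) -> requests seen so far

    def allow(self, client_id: str, now: float) -> bool:
        # Requests in the same window share one counter; a new window starts at zero.
        window_index = int(now // self.window_seconds)
        key = (client_id, window_index)
        count = self._counts.get(key, 0)
        if count >= self.limit:
            return False  # throttled until the window resets
        # An accepted request counts against the limit whether or not it later succeeds.
        self._counts[key] = count + 1
        return True


limiter = FixedWindowLimiter(limit=100, window_seconds=3600)
results = [limiter.allow("client-1", now=10.0) for _ in range(101)]
accepted = sum(results)            # requests let through in the first hour
rejected_last = not results[-1]    # the 101st request is throttled
allowed_after_reset = limiter.allow("client-1", now=3600.0)  # next window
```

In this sketch the 101st request in the hour is rejected, and the same client is accepted again once the next window begins.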

Example Throttling Scenarios

The scenarios below illustrate how throttling plays out for the Amazon Reporting API, which limits report creation requests to 60 per hour. Each scenario describes how throttling affects the timing of data processing.

Scenario 1: Multiple Applications

Context

You have authorized three distinct applications under the same account: one managed by Service A, another by Service B, and the third by Service C. Each application serves a different purpose, but they all rely on the Amazon Reporting API.

Issue

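Service A's application, designed for purpose X, sends 50 report creation requests within the first 30 minutes of an hour. Service B sends 20 requests 15 minutes later, and Service C sends 10 requests 5 minutes after that. Together, the three applications attempt 80 requests in the hour, 20 more than the 60-request limit allows.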
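The arithmetic behind this scenario can be sketched in a few lines. This is a simplified model that assumes all three applications share a single 60-request hourly window and that requests are accepted in arrival order. It ignores retries, which in practice carry the contention into later hours.

```python
HOURLY_LIMIT = 60  # shared report creation limit from the scenario

# (application, requests sent), in the order they arrive within the hour
arrivals = [("Service A", 50), ("Service B", 20), ("Service C", 10)]

remaining = HOURLY_LIMIT
outcome = {}
for app, sent in arrivals:
    accepted = min(sent, remaining)  # only what is left of the shared limit
    outcome[app] = {"accepted": accepted, "throttled": sent - accepted}
    remaining -= accepted

for app, result in outcome.items():
    print(app, result)
```

Under these assumptions, Service A's 50 requests are all accepted, Service B gets 10 of its 20 through, and Service C is throttled on all 10, which is why the last application to arrive is hit hardest.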

Impact

All teams, especially Service C, face frequent throttling errors due to the surge in demand. The continuous competition for API access forces applications into a cycle of retries. This slows down the request and operational cycles for all teams and risks causing significant delays in their request timelines. The intensified competition among teams for the limited API capacity becomes a bottleneck, hindering efficiency and productivity.

This will cause significant delays in requesting reporting data from the API, often preventing reports from being requested.
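Retry cycles like the one described above tend to be less damaging when each client waits longer after every throttled attempt. The sketch below shows exponential backoff with random jitter, a common general-purpose technique. `ThrottledError`, `call_with_backoff`, and `fake_create_report` are hypothetical names used for illustration, not part of any Amazon or Openbridge library.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error raised when the API rejects a request for exceeding its limit."""


def call_with_backoff(send_request, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call send_request, retrying with exponential backoff and jitter when throttled."""
    for attempt in range(max_attempts):
        try:
            return send_request()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Double the ceiling on each retry, then pick a random delay below it
            # so that competing clients do not all retry at the same moment.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))


# Usage with a stand-in request function that is throttled twice, then succeeds.
attempts = {"count": 0}


def fake_create_report():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThrottledError("throttled")
    return "report-queued"


delays = []  # record the delays instead of actually sleeping
result = call_with_backoff(fake_create_report, sleep=delays.append)
```

Backoff does not add capacity. It only spreads retries out so that they do not consume the shared limit as quickly as immediate retries would.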

Scenario 2: Aggressive, Exclusive Usage

Context

A third-party application consumes most of the available API capacity. The third-party developer sends 58 requests in an hour, every hour.

Issue

This application treats the available API capacity as its own to consume, even though the API limit is a resource shared by all applications. As a result, it consumes the available capacity at the expense of others.

Impact

All other teams are left to share the remaining 2 requests per hour. This severely affects other applications, even efficient ones, leading to increased errors and delays. While Scenario 1 may lead to significant delays in requesting reporting data from the API, this scenario blocks nearly all requests from everyone else.

Scenario 3: Burst Requests during Peak Times

Context

A third-party developer increases the frequency of report requests during a major sales event to monitor sales performance.

Issue

In a bid to get reporting data faster, the developer sends a flurry of requests, quickly consuming the 60-request limit every hour over the course of a few days. The developer stops sending requests at the end of the sales event, and available capacity returns to its baseline.

Impact

As the limit is reached, other developers are blocked and unable to generate reports. This creates confusion and disruption because it is not clear what is causing the increased error rates. However, once capacity returns to its normal baseline, the other developers' requests are honored, and operations return to normal.

This can create temporary outages and gaps in data for the duration of the excessive API use.

Scenario 4: Ignoring API Limits

Context

A third-party application is "spamming" the Amazon Reporting API.

Issue

Due to a lack of knowledge, poor coding practices, or sheer negligence, the third-party application sets up a series of operations that continuously sends hundreds or thousands of creation requests to the API every hour, every day. There are no rate-limiting processes or checks in this application, so requests to the API are unbounded.

Impact

This application effectively bombards the API with requests at a rate far higher than permissible. This rapid fire of requests is akin to a Distributed Denial of Service (DDoS) attack on the API. The API quickly hits its limit of 60 requests per hour. All other legitimate and crucial requests from other developers are denied, causing them to face throttling errors. All data requests are blocked. This can cause an extended service outage until the offending developer's access is terminated.
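The missing safeguard in this scenario is a client-side check that runs before each request is sent. Below is a minimal sketch of one, assuming the application tracks its own request timestamps in a sliding one-hour window. The class name `ClientRequestBudget` and the budget of 20 requests are illustrative assumptions (for example, one application's share of the 60-request limit), not values defined by the API.

```python
from collections import deque


class ClientRequestBudget:
    """Tracks this client's own requests so it stays under a self-imposed hourly budget."""

    def __init__(self, budget: int, window_seconds: float = 3600.0):
        self.budget = budget
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of requests sent within the window

    def try_acquire(self, now: float) -> bool:
        # Drop timestamps that have aged out of the sliding window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.budget:
            return False  # the caller should wait instead of sending
        self._sent.append(now)
        return True


budget = ClientRequestBudget(budget=20)  # assumed share of the hourly limit
# An application that wants to send once per second for 1,000 seconds:
sent = sum(budget.try_acquire(now=float(t)) for t in range(1000))
blocked = not budget.try_acquire(now=1000.0)       # still over budget
allowed_later = budget.try_acquire(now=3600.0)     # oldest request has aged out
```

With a check like this in place, an application that wants to send a thousand requests releases only 20 of them per hour, leaving the rest of the shared capacity for others.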