A bucket fills with tokens at a fixed rate, up to its capacity. Each request consumes one token. When the bucket is empty, requests are rejected until new tokens accumulate. The bucket's capacity determines the maximum burst size -- the number of requests that can be served in rapid succession before the steady-state rate takes over.
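The refill-and-consume cycle can be sketched in a few lines of Python. This is a minimal illustration, not a production limiter: the class name `TokenBucket`, the parameters `rate` and `capacity`, and the injectable `clock` are all hypothetical, and the sketch assumes the bucket starts full and refills continuously rather than in discrete ticks.

```python
import time


class TokenBucket:
    """Token bucket sketch: continuous refill, one token per request."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.clock = clock              # injectable time source (assumption, eases testing)
        self.tokens = float(capacity)   # assumption: the bucket starts full
        self.last = clock()

    def allow(self):
        """Return True and consume a token if one is available."""
        now = self.clock()
        # Credit tokens for the elapsed time, never exceeding capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With `rate=2` and `capacity=3`, three back-to-back requests succeed as a burst, the fourth is rejected, and one more is admitted after half a second of refill.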
ALGORITHM I
A time window slides continuously across the request timeline. Rather than resetting counters at fixed intervals, the limiter computes a weighted count: the current fixed window's count plus the previous fixed window's count, scaled by the fraction of the sliding window that still overlaps it. This largely removes the burst-at-boundary problem of fixed windows while keeping the arithmetic simple.
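One way to read the weighted count is `previous * overlap + current`, where `overlap` is the fraction of the sliding window that still falls inside the previous fixed window. The Python sketch below follows that reading. The class name `SlidingWindowCounter`, its parameters, and the injectable `clock` are hypothetical, and the approach assumes requests in the previous window were spread evenly, so the result is an approximation.

```python
import time


class SlidingWindowCounter:
    """Sliding window sketch built on two fixed-window counters."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit      # max requests per sliding window
        self.window = window    # window length in seconds
        self.clock = clock      # injectable time source (assumption, eases testing)
        self.index = int(clock() // window)  # index of the current fixed window
        self.current = 0        # requests counted in the current fixed window
        self.previous = 0       # requests counted in the preceding fixed window

    def allow(self):
        now = self.clock()
        index = int(now // self.window)
        if index != self.index:
            # Roll over. The old count only matters if it belongs to the
            # immediately preceding window; otherwise it has fully expired.
            self.previous = self.current if index == self.index + 1 else 0
            self.current = 0
            self.index = index
        # Fraction of the sliding window still overlapping the previous window.
        overlap = 1.0 - (now % self.window) / self.window
        weighted = self.previous * overlap + self.current
        if weighted < self.limit:
            self.current += 1
            return True
        return False
```

For example, with a limit of 10 per 60 seconds and 10 requests in the previous window, a request arriving 15 seconds into the new window sees a weighted count of 10 × 0.75 = 7.5, so there is room for three more before the limit is reached.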
ALGORITHM II
Requests enter a queue of fixed capacity. The queue drains at a constant rate, regardless of how quickly requests arrive. If the queue is full, new requests are discarded. The leaky bucket smooths bursty traffic into a steady stream -- the output never exceeds the drain rate, even when the input rate is erratic.
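The queue-and-drain behaviour might look like the following Python sketch. The names `LeakyBucket`, `offer`, and `drain` are hypothetical, and the sketch assumes the caller polls `drain()` periodically. A real service would more likely run a timer or worker thread, and a slow poller receives everything that came due since the last call in one batch.

```python
from collections import deque
import time


class LeakyBucket:
    """Leaky bucket sketch: bounded FIFO queue released at a fixed rate."""

    def __init__(self, capacity, drain_rate, clock=time.monotonic):
        self.capacity = capacity            # maximum queued requests
        self.interval = 1.0 / drain_rate    # seconds between releases
        self.clock = clock                  # injectable time source (assumption)
        self.queue = deque()
        self.next_release = clock()         # earliest time of the next release

    def offer(self, request):
        """Enqueue a request; return False if the queue is full."""
        if len(self.queue) >= self.capacity:
            return False
        if not self.queue:
            # An idle queue banks no credit: the first release after a quiet
            # period happens no earlier than now.
            self.next_release = max(self.next_release, self.clock())
        self.queue.append(request)
        return True

    def drain(self):
        """Return the requests that are due for release as of now."""
        now = self.clock()
        released = []
        while self.queue and self.next_release <= now:
            released.append(self.queue.popleft())
            self.next_release += self.interval
        return released
```

Resetting `next_release` when the queue is empty keeps a long idle period from turning into a burst of releases later.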
ALGORITHM III
The request passes through. The token is consumed, the counter increments, and the response flows downstream. Headers report the remaining quota and when it resets: X-RateLimit-Remaining and X-RateLimit-Reset. The client proceeds, unaware of the gate it just passed through.
The request is turned away. The response carries status 429 Too Many Requests and a Retry-After header indicating when the client may try again. The limiter holds firm -- protecting the system from cascading overload, one rejected request at a time.
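Both outcomes come down to a handful of headers, sketched below in Python. The function names `allowed_headers` and `rejection` are hypothetical. The sketch assumes `X-RateLimit-Reset` carries a Unix timestamp in seconds (some services send a number of seconds remaining instead) and that `Retry-After` uses its delay-in-seconds form rather than an HTTP date.

```python
import math


def allowed_headers(remaining, reset_at):
    """Headers to attach to a response for a request that was let through."""
    return {
        "X-RateLimit-Remaining": str(remaining),
        # Assumption: reset time is expressed as Unix epoch seconds.
        "X-RateLimit-Reset": str(int(reset_at)),
    }


def rejection(reset_at, now):
    """Status code, reason phrase, and headers for a rejected request."""
    # Round up so the client never retries before the quota has reset.
    wait = max(0, math.ceil(reset_at - now))
    return 429, "Too Many Requests", {"Retry-After": str(wait)}
```

A client that honours `Retry-After` can simply sleep for the advertised number of seconds before sending the request again.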