An in-memory rate limiter is often good enough in a single long-lived process.
It stops being trustworthy once the same endpoint is served by many ephemeral instances.
That is why serverless deployments usually need rate limiting backed by shared state.
Why Local Counters Fail
If each function instance keeps its own request counts, the limit applies per instance, not per client identity.
That means a distributed deployment can accidentally multiply the effective rate limit just by scaling out.
This is not a bug in the library. It is the natural result of local state in a distributed system.
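The multiplication effect is easy to see with plain in-process counters. This is an illustrative sketch, not real serverless code: the instance count, the limit, and the client id are all made up for the demonstration.

```python
# Each "instance" keeps its own local counter, so the per-client limit
# multiplies with scale-out.

LIMIT = 5  # intended requests per window, per client

def make_instance():
    counts = {}  # local, per-instance state
    def handle(client_id):
        counts[client_id] = counts.get(client_id, 0) + 1
        return counts[client_id] <= LIMIT  # True = request allowed
    return handle

instances = [make_instance() for _ in range(3)]

# A load balancer spreads one client's traffic across all instances.
allowed = sum(instances[i % 3]("client-a") for i in range(30))
print(allowed)  # 15 allowed: 3 instances x limit of 5, not the intended 5
```

Each instance honestly enforces its own limit of 5, yet the client gets 15 requests through, and the number grows with every instance added.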
Why Redis Fits
Redis is useful here because it gives you:
- fast shared state
- atomic operations
- expiration primitives
That is enough to implement fixed-window, sliding-window, or token-bucket approaches, depending on the fairness and cost model you need.
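A fixed-window limiter, for example, needs only INCR and EXPIRE. The sketch below is hedged: `FakeRedis` is a tiny in-memory stand-in with just those two operations, where real code would use a client library such as redis-py; the key format, limit, and window length are assumptions for illustration.

```python
import time

class FakeRedis:
    """In-memory stand-in for a Redis client, supporting only INCR and
    EXPIRE. Real code would use redis-py or similar against shared Redis."""
    def __init__(self):
        self.store = {}  # key -> (value, expiry timestamp or None)

    def incr(self, key):
        value, exp = self.store.get(key, (0, None))
        if exp is not None and exp <= time.time():
            value, exp = 0, None  # key has expired; start fresh
        value += 1
        self.store[key] = (value, exp)
        return value

    def expire(self, key, seconds):
        if key in self.store:
            value, _ = self.store[key]
            self.store[key] = (value, time.time() + seconds)

def allow(r, client_id, limit=10, window=60):
    # Fixed window: one counter per client per window, expired by Redis.
    key = f"ratelimit:{client_id}:{int(time.time() // window)}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window)  # first hit in the window sets the TTL
    return count <= limit

r = FakeRedis()
results = [allow(r, "client-a", limit=3) for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

Because the counter lives in shared state and expires on its own, every instance sees the same count and the limit stops multiplying with scale-out.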
Use the Simplest Algorithm That Matches the Need
You do not always need a sliding-window log.
- fixed window is simple and often sufficient, though it allows bursts at window boundaries
- sliding window is fairer but costs more memory and more Redis work per request
- token bucket is a good fit when controlled burst capacity is acceptable
The correct choice depends on product behavior, not on which algorithm sounds most advanced.
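To make the token-bucket option concrete, here is a minimal sketch with an injected clock so the behavior is deterministic. The capacity and refill rate are illustrative, and the state is kept local to keep the algorithm visible; in a distributed deployment it would live in Redis.

```python
from dataclasses import dataclass

@dataclass
class TokenBucket:
    capacity: float     # maximum burst size, in tokens
    refill_rate: float  # tokens added per second
    tokens: float       # current token count
    last: float         # timestamp of the previous call

    def allow(self, now):
        # Refill based on elapsed time, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1  # spend one token for this request
            return True
        return False

b = TokenBucket(capacity=2, refill_rate=1.0, tokens=2, last=0.0)
seq = [b.allow(t) for t in (0.0, 0.1, 0.2, 1.5)]
print(seq)  # [True, True, False, True]
```

The first two requests spend the stored burst capacity, the third is rejected because the bucket has not refilled yet, and the fourth passes after enough time has elapsed. That burst-then-throttle shape is exactly the product behavior you are choosing when you pick this algorithm.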
Keep the Decision Atomic
If evaluating and updating the limit takes more than one Redis command, keep the sequence atomic with either:
- a Lua script, which Redis executes as a single unit
- a transaction pattern (or a single command such as INCR) that truly fits the algorithm

Under concurrency, a non-atomic check-then-update lets two requests each see one remaining slot and both pass.
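The failure mode is easy to force by hand. In this sketch a plain dict stands in for Redis, the interleaving is written out explicitly, and the single-step increment only models the semantics of Redis INCR (an in-process `+=` is not itself atomic across real threads).

```python
store = {"count": 4}
LIMIT = 5

# Non-atomic check-then-increment, interleaved across two requests:
a_seen = store["count"]       # request A reads 4
b_seen = store["count"]       # request B also reads 4
store["count"] = a_seen + 1   # A writes 5 and is allowed
store["count"] = b_seen + 1   # B writes 5 and is ALSO allowed
non_atomic_allowed = (a_seen < LIMIT) + (b_seen < LIMIT)
print(non_atomic_allowed)  # 2 allowed, but only 1 slot remained

# Atomic increment, modeling what Redis INCR or a Lua script gives you:
store["count"] = 4
def atomic_incr():
    store["count"] += 1       # one indivisible step on the Redis side
    return store["count"]

atomic_allowed = sum(atomic_incr() <= LIMIT for _ in range(2))
print(atomic_allowed)  # only 1 of the 2 requests is allowed
```

With the atomic version, each request observes the count it just produced, so the decision and the update can never be separated by another request.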
The Practical Rule
For distributed and serverless APIs:
- store rate-limit state centrally
- pick an algorithm that matches the user experience you want
- keep the update path atomic
That is more important than which framework middleware you started with.
Further Reading