Cerberus Docs

Everything you need to add production-grade rate limiting to your APIs. Sub-millisecond enforcement powered by atomic Redis Lua scripts.

Introduction#

Cerberus is an open-source, Redis-backed rate limiting service designed for modern API infrastructure. It provides sub-millisecond decision latency using atomic Lua scripts, supports multiple algorithms (sliding window and token bucket), and offers full multi-tenant isolation.

Deploy it as a sidecar, standalone service, or use the managed cloud version. Cerberus handles the complexity of distributed rate limiting so you can focus on your product.

Tip
Cerberus is fully open source under the MIT license. Self-host it on your own infrastructure with zero limitations.

Quickstart#

Get Cerberus running locally in under 5 minutes with Docker.

1. Start with Docker Compose#

bash
git clone https://github.com/agnij-dutta/cerberus.git
cd cerberus
docker compose up -d

2. Create a Tenant#

bash
curl -X POST http://localhost:8000/v1/tenants \
  -H "X-Admin-Key: your-admin-key" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-app"}'

# Response:
# { "id": "...", "name": "my-app", "api_key": "cerb_abc123..." }

Warning
Save the api_key returned — it's only shown once. The key is hashed with SHA-256 and cannot be recovered.

3. Create a Rate Limit Policy#

bash
curl -X POST http://localhost:8000/v1/policies \
  -H "X-API-Key: cerb_abc123..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "api-default",
    "algorithm": "sliding_window",
    "limit": 100,
    "window_seconds": 60
  }'

4. Check a Request#

bash
curl -X POST http://localhost:8000/v1/check \
  -H "X-API-Key: cerb_abc123..." \
  -H "Content-Type: application/json" \
  -d '{"key": "user:42", "policy": "api-default"}'

# Response:
# {
#   "allowed": true,
#   "remaining": 99,
#   "limit": 100,
#   "reset_at": 1710100060
# }

Installation#

Python SDK#

bash
pip install cerberus-sdk

python
from cerberus import CerberusClient

client = CerberusClient(
    base_url="http://localhost:8000",
    api_key="cerb_abc123..."
)

# Check rate limit
result = client.check("user:42", "api-default")
if result.allowed:
    # Process request
    print(f"Remaining: {result.remaining}/{result.limit}")
else:
    # Rate limited — back off
    print(f"Retry after: {result.reset_at}")

TypeScript SDK#

bash
npm install @cerberus/sdk

typescript
import { Cerberus } from "@cerberus/sdk";

const cerberus = new Cerberus({
  baseUrl: "http://localhost:8000",
  apiKey: "cerb_abc123...",
});

const result = await cerberus.check("user:42", "api-default");

if (result.allowed) {
  console.log(`Remaining: ${result.remaining}/${result.limit}`);
} else {
  console.log(`Rate limited. Reset at: ${result.resetAt}`);
}

Rate Limiting#

Cerberus enforces rate limits using atomic Redis Lua scripts — no race conditions, no distributed locks, no approximations. Every decision completes in a single Redis round-trip.

Each check is identified by a composite key (e.g., user:42) and evaluated against a named policy. The response tells you whether the request is allowed and how many requests remain in the window.

Policies#

Policies define the rules for rate limiting. Each policy specifies an algorithm, a request limit, and a time window. Policies are scoped to your tenant — no cross-tenant leakage.

json
{
  "name": "api-default",
  "algorithm": "sliding_window",
  "limit": 100,
  "window_seconds": 60
}

// Token bucket example:
{
  "name": "burst-friendly",
  "algorithm": "token_bucket",
  "limit": 50,
  "window_seconds": 1,
  "burst_limit": 10
}

Algorithms#

Sliding Window#

The sliding window algorithm tracks requests across a rolling time window, providing smooth rate limiting without the "burst at boundary" problem of fixed windows. Cerberus implements this with a single Redis sorted set and an atomic Lua script.

Best for

APIs with steady traffic patterns

Complexity

O(log N) per check

Redis Structure

Sorted set (ZRANGEBYSCORE)

Boundary Issues

None
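The mechanics can be sketched in plain Python. This is illustrative only: Cerberus performs the equivalent work atomically in a Redis Lua script over a sorted set, whereas this version uses an in-process list.

```python
import time

# Sketch of the sliding window check. In Redis, the list below is a sorted
# set: pruning is ZREMRANGEBYSCORE, recording a request is ZADD.
def sliding_window_check(timestamps, limit, window_seconds, now=None):
    """timestamps: list of prior request times (stand-in for the sorted set)."""
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    # Drop entries that have aged out of the rolling window.
    timestamps[:] = [t for t in timestamps if t > cutoff]
    if len(timestamps) < limit:
        timestamps.append(now)  # record this request
        return True, limit - len(timestamps)
    return False, 0
```

Because the window rolls continuously, a client cannot double its effective rate by clustering requests around a fixed-window boundary.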

Token Bucket#

The token bucket algorithm allows controlled bursting. Tokens refill at a constant rate, and each request consumes one token. If the bucket is empty, the request is denied. The burst_limit parameter controls the maximum bucket size.

Best for

APIs allowing bursts (webhooks, uploads)

Complexity

O(1) per check

Redis Structure

Hash (HMSET)

Burst Support

Yes — configurable
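The refill arithmetic can be sketched as follows. Again, this is illustrative: Cerberus runs the equivalent logic atomically in a Redis Lua script over a hash, while this version mutates a plain dict.

```python
import time

# Sketch of the token bucket check. The dict mirrors the Redis hash:
# current token count plus the time of the last refill.
def token_bucket_check(state, rate_per_sec, burst_limit, cost=1, now=None):
    """state: dict with 'tokens' and 'updated_at' (stand-in for the hash)."""
    now = time.time() if now is None else now
    elapsed = max(0.0, now - state["updated_at"])
    # Refill at a constant rate, capped at the bucket size.
    state["tokens"] = min(burst_limit, state["tokens"] + elapsed * rate_per_sec)
    state["updated_at"] = now
    if state["tokens"] >= cost:
        state["tokens"] -= cost
        return True
    return False
```

An idle bucket fills to `burst_limit`, which is exactly what allows a short burst before throttling back to the steady refill rate.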

Authentication#

All API requests require a valid API key passed in the X-API-Key header. Keys are prefixed with cerb_ for easy identification.

bash
# All requests must include your API key:
curl -H "X-API-Key: cerb_abc123..." \
  http://localhost:8000/v1/check

Note
API keys are hashed with SHA-256 before storage. Cerberus uses a prefix-based lookup (first 8 characters) for fast key resolution, then verifies the full hash.
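The lookup scheme described in the note can be sketched like this (the exact storage layout inside Cerberus is an assumption here; only "SHA-256 hash, 8-character prefix index" comes from the docs):

```python
import hashlib
import hmac

def hash_api_key(api_key: str) -> tuple[str, str]:
    """Return the 8-char lookup prefix and the SHA-256 hex digest of a key."""
    return api_key[:8], hashlib.sha256(api_key.encode()).hexdigest()

def verify_api_key(api_key: str, store: dict[str, str]) -> bool:
    """store maps lookup prefixes to stored digests (assumed layout)."""
    prefix, digest = hash_api_key(api_key)
    stored = store.get(prefix)
    # Constant-time comparison avoids leaking digest bytes via timing.
    return stored is not None and hmac.compare_digest(stored, digest)
```

The prefix narrows the candidate set cheaply; the full-hash comparison is what actually authenticates the key.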

Check Endpoint#

The core endpoint. Call it before processing any request to determine whether it should be allowed or rate-limited.

POST /v1/check

Request Body#

key (string, required)
Unique identifier for the rate limit subject (e.g., user ID, IP)

policy (string, required)
Name of the rate limit policy to evaluate against

cost (integer, optional)
Number of tokens to consume (default: 1)

Response#

allowed (boolean)
Whether the request should be allowed

remaining (integer)
Number of requests remaining in the current window

limit (integer)
Total requests allowed per window

reset_at (integer)
Unix timestamp when the window resets
json
// 200 OK — Allowed
{
  "allowed": true,
  "remaining": 97,
  "limit": 100,
  "reset_at": 1710100060
}

// 200 OK — Rate Limited
{
  "allowed": false,
  "remaining": 0,
  "limit": 100,
  "reset_at": 1710100060
}

Tip
The check endpoint always returns 200 OK. It's your application's responsibility to act on the allowed field. Return 429 Too Many Requests to your clients when allowed is false.

Analytics#

Cerberus tracks request metrics per tenant. Query the analytics endpoint to get insights into your rate limiting traffic.

GET /v1/analytics

json
{
  "total_checks": 1542893,
  "allowed": 1498210,
  "denied": 44683,
  "denial_rate": 0.029,
  "p99_latency_ms": 0.8,
  "policies": {
    "api-default": { "checks": 982100, "denied": 31200 },
    "auth-strict": { "checks": 560793, "denied": 13483 }
  }
}
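A quick consumer of this payload might surface policies whose denial rate runs hot. A minimal sketch, assuming the response shape shown above (the 3% threshold is an arbitrary example):

```python
# Flag policies whose per-policy denial rate exceeds a threshold,
# given the /v1/analytics response body shown above.
def noisy_policies(analytics: dict, threshold: float = 0.03) -> list[str]:
    return sorted(
        name
        for name, stats in analytics["policies"].items()
        if stats["checks"] and stats["denied"] / stats["checks"] > threshold
    )
```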