Cerberus Docs

Everything you need to add production-grade rate limiting to your APIs. Sub-millisecond enforcement powered by atomic Redis Lua scripts.

Introduction#

Cerberus is an open-source, Redis-backed rate limiting service designed for modern API infrastructure. It provides sub-millisecond decision latency using atomic Lua scripts, supports multiple algorithms (sliding window and token bucket), and offers full multi-tenant isolation.

Deploy it as a sidecar, standalone service, or use the managed cloud version. Cerberus handles the complexity of distributed rate limiting so you can focus on your product.

Tip
Cerberus is fully open source under the MIT license. Self-host it on your own infrastructure with zero limitations.

Quickstart#

Get Cerberus running locally in under 5 minutes with Docker.

1. Start with Docker Compose#

bash
git clone https://github.com/agnij-dutta/cerberus.git
cd cerberus
docker compose up -d

2. Create a Tenant#

bash
curl -X POST http://localhost:8000/v1/tenants \
  -H "X-Admin-Key: your-admin-key" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-app"}'

# Response:
# { "id": "...", "name": "my-app", "api_key": "cerb_abc123..." }

Warning
Save the api_key returned — it's only shown once. The key is hashed with SHA-256 and cannot be recovered.

3. Create a Rate Limit Policy#

bash
curl -X POST http://localhost:8000/v1/policies \
  -H "X-API-Key: cerb_abc123..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "api-default",
    "algorithm": "sliding_window",
    "limit": 100,
    "window_seconds": 60
  }'

4. Check a Request#

bash
curl -X POST http://localhost:8000/v1/check \
  -H "X-API-Key: cerb_abc123..." \
  -H "Content-Type: application/json" \
  -d '{"key": "user:42", "policy": "api-default"}'

# Response:
# {
#   "allowed": true,
#   "remaining": 99,
#   "limit": 100,
#   "reset_at": 1710100060
# }

Installation#

Python SDK#

bash
pip install cerberus-sdk

python
from cerberus import CerberusClient

client = CerberusClient(
    base_url="http://localhost:8000",
    api_key="cerb_abc123..."
)

# Check rate limit
result = client.check("user:42", "api-default")
if result.allowed:
    # Process request
    print(f"Remaining: {result.remaining}/{result.limit}")
else:
    # Rate limited — back off
    print(f"Retry after: {result.reset_at}")

TypeScript SDK#

bash
npm install @cerberus/sdk

typescript
import { Cerberus } from "@cerberus/sdk";

const cerberus = new Cerberus({
  baseUrl: "http://localhost:8000",
  apiKey: "cerb_abc123...",
});

const result = await cerberus.check("user:42", "api-default");

if (result.allowed) {
  console.log(`Remaining: ${result.remaining}/${result.limit}`);
} else {
  console.log(`Rate limited. Reset at: ${result.resetAt}`);
}

Rate Limiting#

Cerberus enforces rate limits using atomic Redis Lua scripts — no race conditions, no distributed locks, no approximations. Every decision completes in a single Redis round-trip.

Each check is identified by a composite key (e.g., user:42) and evaluated against a named policy. The response tells you whether the request is allowed and how many requests remain in the window.

Policies#

Policies define the rules for rate limiting. Each policy specifies an algorithm, a request limit, and a time window. Policies are scoped to your tenant — no cross-tenant leakage.

json
{
  "name": "api-default",
  "algorithm": "sliding_window",
  "limit": 100,
  "window_seconds": 60
}

// Token bucket example:
{
  "name": "burst-friendly",
  "algorithm": "token_bucket",
  "limit": 50,
  "window_seconds": 1,
  "burst_limit": 10
}

Algorithms#

Sliding Window#

The sliding window algorithm tracks requests across a rolling time window, providing smooth rate limiting without the "burst at boundary" problem of fixed windows. Cerberus implements this with a single Redis sorted set and an atomic Lua script.

Best for

APIs with steady traffic patterns

Complexity

O(log N) per check

Redis Structure

Sorted set (ZRANGEBYSCORE)

Boundary Issues

None
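The mechanics can be sketched in plain Python. This is illustrative only: Cerberus performs the equivalent work atomically in a Redis Lua script over a sorted set, whereas this version uses an in-process list.

```python
import time

# Sketch of the sliding window check. In Redis, the list below is a sorted
# set: pruning is ZREMRANGEBYSCORE, recording a request is ZADD.
def sliding_window_check(timestamps, limit, window_seconds, now=None):
    """timestamps: list of prior request times (stand-in for the sorted set)."""
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    # Drop entries that have aged out of the rolling window.
    timestamps[:] = [t for t in timestamps if t > cutoff]
    if len(timestamps) < limit:
        timestamps.append(now)  # record this request
        return True, limit - len(timestamps)
    return False, 0
```

Because the window rolls continuously, a client cannot double its effective rate by clustering requests around a fixed-window boundary.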

Token Bucket#

The token bucket algorithm allows controlled bursting. Tokens refill at a constant rate, and each request consumes one token. If the bucket is empty, the request is denied. The burst_limit parameter controls the maximum bucket size.

Best for

APIs allowing bursts (webhooks, uploads)

Complexity

O(1) per check

Redis Structure

Hash (HMSET)

Burst Support

Yes — configurable
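The refill arithmetic can be sketched as follows. Again, this is illustrative: Cerberus runs the equivalent logic atomically in a Redis Lua script over a hash, while this version mutates a plain dict.

```python
import time

# Sketch of the token bucket check. The dict mirrors the Redis hash:
# current token count plus the time of the last refill.
def token_bucket_check(state, rate_per_sec, burst_limit, cost=1, now=None):
    """state: dict with 'tokens' and 'updated_at' (stand-in for the hash)."""
    now = time.time() if now is None else now
    elapsed = max(0.0, now - state["updated_at"])
    # Refill at a constant rate, capped at the bucket size.
    state["tokens"] = min(burst_limit, state["tokens"] + elapsed * rate_per_sec)
    state["updated_at"] = now
    if state["tokens"] >= cost:
        state["tokens"] -= cost
        return True
    return False
```

An idle bucket fills to `burst_limit`, which is exactly what allows a short burst before throttling back to the steady refill rate.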

Authentication#

All API requests require a valid API key passed in the X-API-Key header. Keys are prefixed with cerb_ for easy identification.

bash
# All requests must include your API key:
curl -H "X-API-Key: cerb_abc123..." \
  http://localhost:8000/v1/check

Note
API keys are hashed with SHA-256 before storage. Cerberus uses a prefix-based lookup (first 8 characters) for fast key resolution, then verifies the full hash.
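The lookup scheme described in the note can be sketched like this (the exact storage layout inside Cerberus is an assumption here; only "SHA-256 hash, 8-character prefix index" comes from the docs):

```python
import hashlib
import hmac

def hash_api_key(api_key: str) -> tuple[str, str]:
    """Return the 8-char lookup prefix and the SHA-256 hex digest of a key."""
    return api_key[:8], hashlib.sha256(api_key.encode()).hexdigest()

def verify_api_key(api_key: str, store: dict[str, str]) -> bool:
    """store maps lookup prefixes to stored digests (assumed layout)."""
    prefix, digest = hash_api_key(api_key)
    stored = store.get(prefix)
    # Constant-time comparison avoids leaking digest bytes via timing.
    return stored is not None and hmac.compare_digest(stored, digest)
```

The prefix narrows the candidate set cheaply; the full-hash comparison is what actually authenticates the key.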

Check Endpoint#

The core endpoint. Call it before processing any request to determine whether it should be allowed or rate-limited.

POST /v1/check

Request Body#

key (string, required)
Unique identifier for the rate limit subject (e.g., user ID, IP)

policy (string, required)
Name of the rate limit policy to evaluate against

cost (integer, optional)
Number of tokens to consume (default: 1)

Response#

allowed (boolean)
Whether the request should be allowed

remaining (integer)
Number of requests remaining in the current window

limit (integer)
Total requests allowed per window

reset_at (integer)
Unix timestamp when the window resets
json
// 200 OK — Allowed
{
  "allowed": true,
  "remaining": 97,
  "limit": 100,
  "reset_at": 1710100060
}

// 200 OK — Rate Limited
{
  "allowed": false,
  "remaining": 0,
  "limit": 100,
  "reset_at": 1710100060
}

Tip
The check endpoint always returns 200 OK. It's your application's responsibility to act on the allowed field. Return 429 Too Many Requests to your clients when allowed is false.

Analytics#

Cerberus tracks request metrics per tenant. Query the analytics endpoint to get insights into your rate limiting traffic.

GET /v1/analytics

json
{
  "total_checks": 1542893,
  "allowed": 1498210,
  "denied": 44683,
  "denial_rate": 0.029,
  "p99_latency_ms": 0.8,
  "policies": {
    "api-default": { "checks": 982100, "denied": 31200 },
    "auth-strict": { "checks": 560793, "denied": 13483 }
  }
}
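A quick consumer of this payload might surface policies whose denial rate runs hot. A minimal sketch, assuming the response shape shown above (the 3% threshold is an arbitrary example):

```python
# Flag policies whose per-policy denial rate exceeds a threshold,
# given the /v1/analytics response body shown above.
def noisy_policies(analytics: dict, threshold: float = 0.03) -> list[str]:
    return sorted(
        name
        for name, stats in analytics["policies"].items()
        if stats["checks"] and stats["denied"] / stats["checks"] > threshold
    )
```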