Rate Limiting
This document details the standard rate limiting implementation, including smart key functions, tiers, and usage patterns.
Smart Rate Limiting Architecture
Conditional Key Function
The system uses an intelligent key function in utils/rate_limiter.py to automatically determine the rate limit key based on authentication status.
Security requirement: rate limiting must never trust unverified JWT payloads. If you key on user_id, that user_id must come from a verified authentication step (signature verified, expiry checked, etc.), otherwise an attacker can forge a token payload and evade IP-based throttles.
from fastapi import Request
from slowapi.util import get_remote_address

def get_rate_limit_key(request: Request) -> str:
    """
    Automatically determine rate limit key based on authentication.
    - Authenticated requests: Limited by user_id
    - Public requests: Limited by IP address
    """
    # Preferred: rely on verified auth attaching an identity to request.state.
    # Example: your get_current_user dependency (or auth middleware) sets:
    #   request.state.authenticated_user_id = "<verified user id>"
    user_id = getattr(request.state, "authenticated_user_id", None)
    if user_id:
        return f"user:{user_id}"
    # Fall back to IP for public/unauthenticated requests
    return f"ip:{get_remote_address(request)}"
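To make the request.state hand-off concrete, here is a minimal sketch of the key decision using stand-in request objects. It is an illustration under assumptions, not the project's actual auth code: make_request, attach_verified_identity, and select_key are hypothetical names, the stand-in only models .state and .client.host, and reading client.host merely approximates what get_remote_address does.

```python
from types import SimpleNamespace
from typing import Optional


def make_request(client_host: str) -> SimpleNamespace:
    """Stand-in for a FastAPI Request: only .state and .client.host are modelled."""
    return SimpleNamespace(
        state=SimpleNamespace(),
        client=SimpleNamespace(host=client_host),
    )


def attach_verified_identity(request: SimpleNamespace, verified_user_id: Optional[str]) -> None:
    """Hypothetical auth step: call this only AFTER the token has been verified."""
    if verified_user_id:
        request.state.authenticated_user_id = verified_user_id


def select_key(request: SimpleNamespace) -> str:
    """Same decision as get_rate_limit_key, applied to the stand-in request."""
    user_id = getattr(request.state, "authenticated_user_id", None)
    if user_id:
        return f"user:{user_id}"
    return f"ip:{request.client.host}"


public = make_request("203.0.113.7")
signed_in = make_request("203.0.113.7")
attach_verified_identity(signed_in, "abc123")

print(select_key(public))     # ip:203.0.113.7
print(select_key(signed_in))  # user:abc123
```

Both requests come from the same address, yet only the one that passed through the verified-auth step is keyed by user.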
Alternative: verify JWT before extracting user_id (only if you cannot use request.state)
If you truly cannot attach the verified identity onto request.state, you may verify the token inside the key function, with signature verification enabled and strict failure handling. Avoid turning the key function into a second full auth system; prefer the request.state approach above.
import os
from fastapi import Request
from jose import JWTError, jwt
from slowapi.util import get_remote_address

SECRET_KEY = os.getenv("SECRET_KEY", "")
ALGORITHMS = ["HS256"]

def get_rate_limit_key(request: Request) -> str:
    auth_header = request.headers.get("Authorization", "")
    if auth_header.startswith("Bearer "):
        token = auth_header.split(" ", 1)[1].strip()
        if token and SECRET_KEY:
            try:
                payload = jwt.decode(token, SECRET_KEY, algorithms=ALGORITHMS)
                user_id = payload.get("user_id")
                if user_id:
                    return f"user:{user_id}"
            except JWTError:
                # Invalid/expired tokens should not grant user-based limits.
                pass
    return f"ip:{get_remote_address(request)}"
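The following standard-library sketch illustrates why the payload must not be trusted before the signature is checked. It hand-rolls HS256 purely for demonstration and does not check expiry; all helper names and the secrets are made up for the example. In real code, keep using a maintained JWT library as shown above.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return b64url(digest)


def make_token(payload: dict, secret: str) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signing_input = f"{header}.{body}"
    return f"{signing_input}.{sign_hs256(signing_input, secret)}"


def unverified_user_id(token: str):
    """UNSAFE: reads the payload without checking the signature."""
    return json.loads(b64url_decode(token.split(".")[1])).get("user_id")


def signature_is_valid(token: str, secret: str) -> bool:
    header, body, signature = token.split(".")
    expected = sign_hs256(f"{header}.{body}", secret)
    return hmac.compare_digest(signature, expected)


server_secret = "demo-secret-not-for-production"
# The attacker does not know the server secret, so they sign with a guess.
forged = make_token({"user_id": "victim-or-random-id"}, "attacker-guess")

print(unverified_user_id(forged))                 # victim-or-random-id
print(signature_is_valid(forged, server_secret))  # False
```

A key function that read the payload without verification would hand this forged token its own user bucket, and an attacker could rotate the user_id on every request to escape the IP-based limit.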
Rate Limit Tiers
Public Endpoints (IP-Based)
Authentication/Registration - 10 requests/minute:
- User registration
- Login
- Password reset
Public Forms - 5 requests/minute:
- Support tickets
- Feedback forms
- Issue reports
High-Traffic Public Endpoints - 20 requests/minute:
- Public configuration fetch
- Widget initialization
Authenticated Endpoints (User-Based)
Standard CRUD Operations - 100 requests/minute:
- All create, read, update, delete operations
Real-Time/Streaming - 120 requests/minute:
- Chat/Streaming endpoints (higher limit for real-time feel)
Search/Autocomplete - 200 requests/minute:
- Search endpoints
- Autocomplete lookups
Admin/Sensitive - 30 requests/minute:
- API key management
- Admin operations
- Payment operations
- Security admin endpoints
File Operations - 20 requests/minute:
- File uploads
- Scans
Analytics - 60 requests/minute:
- Statistics endpoints
Webhooks - 100 requests/minute:
- Integration endpoints
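One way to keep these tiers consistent across routers is to define them once and look them up by name. The sketch below is a suggestion rather than part of the documented implementation: RATE_LIMIT_TIERS, the tier names, and limit_for are hypothetical, while the limit values are the ones listed above.

```python
RATE_LIMIT_TIERS = {
    # Public endpoints (keyed by IP)
    "public_auth": "10/minute",
    "public_forms": "5/minute",
    "public_high_traffic": "20/minute",
    # Authenticated endpoints (keyed by user)
    "crud": "100/minute",
    "streaming": "120/minute",
    "search": "200/minute",
    "admin": "30/minute",
    "files": "20/minute",
    "analytics": "60/minute",
    "webhooks": "100/minute",
}


def limit_for(tier: str) -> str:
    """Return the limit string for a tier, failing loudly on typos."""
    try:
        return RATE_LIMIT_TIERS[tier]
    except KeyError:
        raise ValueError(f"Unknown rate limit tier: {tier!r}") from None


print(limit_for("search"))  # 200/minute
```

An endpoint could then be decorated with @limiter.limit(limit_for("search")) instead of repeating the literal string.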
Usage
Apply Rate Limit to Endpoint
from fastapi import APIRouter, Depends, Request
from slowapi import Limiter
from utils.rate_limiter import get_rate_limit_key
# get_current_user is your project's verified-auth dependency; import it from wherever it is defined.

router = APIRouter()
limiter = Limiter(key_func=get_rate_limit_key)

@router.post("/endpoint")
@limiter.limit("100/minute")
async def endpoint(request: Request, token: dict = Depends(get_current_user)):
    """
    Automatically applies:
    - User-based limit if authenticated (user:abc123)
    - IP-based limit if public (ip:192.168.1.1)
    """
    # slowapi needs the endpoint to accept an explicit `request: Request` argument.
    return {"status": "ok"}
Different Limits for Different Endpoints
# Each rate-limited endpoint takes `request: Request` so slowapi can build the key.
@router.post("/public-endpoint")
@limiter.limit("10/minute")  # Stricter for public
async def public_endpoint(request: Request):
    pass

@router.get("/authenticated-endpoint")
@limiter.limit("100/minute")  # More lenient for authenticated
async def auth_endpoint(request: Request, token: dict = Depends(get_current_user)):
    pass

@router.get("/search-endpoint")
@limiter.limit("200/minute")  # Higher for search/autocomplete
async def search_endpoint(request: Request, token: dict = Depends(get_current_user)):
    pass
Environment-Aware Configuration
Rate limiting should be configured to be:
- DISABLED in local and test environments (for development and automated testing)
- ENABLED in development and production environments
- OVERRIDDEN via env var DISABLE_RATE_LIMITER=1 (e.g., for scheduled internal tasks)
Configuration pattern:
import os
from slowapi import Limiter
from utils.rate_limiter import get_rate_limit_key

if os.getenv("DISABLE_RATE_LIMITER") == "1":
    # Explicit override wins regardless of environment.
    limiter = Limiter(key_func=get_rate_limit_key, enabled=False)
else:
    ENVIRONMENT = os.getenv("ENVIRONMENT", "local")
    limiter = Limiter(
        key_func=get_rate_limit_key,
        enabled=ENVIRONMENT not in ["local", "test"],
    )
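The enable/disable decision can also be pulled into a small pure function, which makes the precedence rules easy to unit test without constructing a Limiter. This is a sketch under the same rules as the pattern above; rate_limiter_enabled and DISABLED_ENVIRONMENTS are hypothetical names.

```python
import os
from typing import Mapping

DISABLED_ENVIRONMENTS = {"local", "test"}


def rate_limiter_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Decide whether rate limiting is on, given environment variables."""
    # The explicit override takes precedence over the environment name.
    if env.get("DISABLE_RATE_LIMITER") == "1":
        return False
    # An unset ENVIRONMENT is treated as "local", i.e. disabled.
    return env.get("ENVIRONMENT", "local") not in DISABLED_ENVIRONMENTS


print(rate_limiter_enabled({"ENVIRONMENT": "production"}))                               # True
print(rate_limiter_enabled({"ENVIRONMENT": "test"}))                                     # False
print(rate_limiter_enabled({"ENVIRONMENT": "production", "DISABLE_RATE_LIMITER": "1"}))  # False
```

The result could then be passed straight through as Limiter(key_func=get_rate_limit_key, enabled=rate_limiter_enabled()).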
Benefits Over Simple Rate Limiting
- Fair Resource Distribution: User-based limits ensure one user can't consume all resources, even if multiple users are behind the same IP.
- Shared IP Protection: Corporate users behind NAT won't block each other since they're tracked by user_id, not IP.
- Account Accountability: Can identify and handle abusive user accounts.
- Credential Sharing Prevention: Multiple people using the same credentials share one user limit and will hit it faster, which discourages sharing.
- Flexible Limits: Public endpoints have stricter limits than authenticated ones.
- Automatic Detection: No need to maintain separate limiters for different endpoint types.
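The shared-IP benefit can be seen with a toy in-memory counter. This is only an illustration of how the choice of key changes who shares a bucket; FixedWindowCounter is a hypothetical class and is not how slowapi stores or evaluates its limits.

```python
from collections import defaultdict


class FixedWindowCounter:
    """Toy limiter: at most `limit` hits per key within each time window."""

    def __init__(self, limit: int, window_seconds: int = 60) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits = defaultdict(int)

    def allow(self, key: str, now: float) -> bool:
        # Hits are counted per (key, window index) pair.
        bucket = (key, int(now // self.window_seconds))
        if self._hits[bucket] >= self.limit:
            return False
        self._hits[bucket] += 1
        return True


counter = FixedWindowCounter(limit=2)
now = 1_000.0

# Alice exhausts her own allowance...
alice = [counter.allow("user:alice", now) for _ in range(3)]
# ...but Bob, behind the same NAT address, is unaffected.
bob = counter.allow("user:bob", now)
# Keyed by IP instead, everyone behind that address shares one bucket.
shared = [counter.allow("ip:198.51.100.4", now) for _ in range(3)]

print(alice)   # [True, True, False]
print(bob)     # True
print(shared)  # [True, True, False]
```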
Response When Rate Limited
HTTP Status: 429 Too Many Requests
Response Body:
{
  "error": "Rate limit exceeded"
}
Headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1704412800
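On the client side, these headers can drive a simple back-off. The sketch below assumes X-RateLimit-Reset is a Unix timestamp in seconds, as the sample value suggests; seconds_until_reset is a hypothetical helper, and it expects the header name in exactly the casing shown.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before retrying after a 429 response."""
    now = time.time() if now is None else now
    reset_raw = headers.get("X-RateLimit-Reset")
    if reset_raw is None:
        return 0.0
    try:
        reset_at = float(reset_raw)
    except ValueError:
        # Unparseable header: do not guess a wait time.
        return 0.0
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - now)


headers = {
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1704412800",
}
print(seconds_until_reset(headers, now=1704412770.0))  # 30.0
```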