Gateway Rate Limiting: Developer Guide
Overview
The Ngwenya API Gateway enforces a tiered, endpoint-aware rate limiting system that protects the platform from abuse while providing fair throughput to authenticated Visitors, Buyers, and Malet Owners.
Three layers of protection work in concert:
| Layer | Protection | Default |
|---|---|---|
| Burst window | Catches rapid-fire scripting | 20 req / 10s |
| Sustained window | Prevents slow-drip abuse | 200 req / 60s |
| Query complexity | Rejects expensive GraphQL queries | Max cost 500, depth 10 |
Both rate limit windows must be satisfied for a request to pass. Query complexity is evaluated independently after the rate check.
Architecture
The rate limiting system is implemented in the apps/ngwenya-gateway/src/rate-limit/ module and consists of five files:
| File | Purpose |
|---|---|
| `rate-limit.module.ts` | Module registration with dual-window ThrottlerModule config |
| `rate-limit.constants.ts` | Central defaults, env var keys, and preset profiles |
| `gql-throttler.guard.ts` | Custom ThrottlerGuard with auth uplift and mutation penalties |
| `endpoint-throttle.decorator.ts` | `@EndpointThrottle()` for per-endpoint overrides |
| `throttler-redis-storage.ts` | Redis-backed ThrottlerStorage adapter |
Request Flow
graph TD
A["Incoming Request"] --> B{"@SkipThrottle?"}
B -->|Yes| C["✅ Pass Through"]
B -->|No| D{"@EndpointThrottle?"}
D -->|Yes| E["Use endpoint limits"]
D -->|No| F["Use global defaults"]
E --> G{"Authenticated?"}
F --> G
G -->|Yes| H["limit × 2.0"]
G -->|No| I["keep base limit"]
H --> J{"GraphQL Mutation?"}
I --> J
J -->|Yes| K["limit × 0.5"]
J -->|No| L["keep current limit"]
K --> M{"Check SHORT window"}
L --> M
M -->|Pass| N{"Check LONG window"}
M -->|Fail| O["❌ 429 burst window"]
N -->|Pass| P{"Query Complexity Check"}
N -->|Fail| Q["❌ 429 sustained window"]
P -->|Pass| R["✅ Resolve"]
P -->|Fail| S["❌ QUERY_TOO_COMPLEX"]
style C fill:#22c55e,color:#fff
style R fill:#22c55e,color:#fff
style O fill:#ef4444,color:#fff
style Q fill:#ef4444,color:#fff
style S fill:#ef4444,color:#fff
Dual-Window Throttling
Instead of a single rate limit window, the gateway enforces two concurrent windows:
Short (Burst) Window
{
  name: 'short',
  ttl: 10_000, // 10 seconds
  limit: 20, // 20 requests per window
}
This catches rapid-fire automated requests: bots, scrapers, or brute-force scripts that fire dozens of requests per second.
Long (Sustained) Window
{
  name: 'long',
  ttl: 60_000, // 60 seconds
  limit: 200, // 200 requests per window
}
This catches slower abuse patterns that would slip under the burst window but are still abnormally high when measured over a minute.
Both windows must pass. A request that fits within the burst budget but has exceeded the sustained budget will still be rejected.
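The "both windows must pass" rule can be illustrated with a small in-memory model. This is a minimal sketch, not the gateway's code: `DualWindowLimiter`, `WindowConfig`, and `WindowState` are hypothetical names, and it assumes fixed (non-sliding) windows and that a request rejected by the burst window is not counted against the sustained window.

```typescript
// Hypothetical model of dual-window throttling; names and fixed-window behaviour are assumptions.
interface WindowConfig {
  name: string;
  ttl: number; // window length in ms
  limit: number; // max requests per window
}

interface WindowState {
  count: number;
  expiresAt: number; // timestamp (ms) at which the window resets
}

const DEFAULT_WINDOWS: WindowConfig[] = [
  { name: 'short', ttl: 10_000, limit: 20 },
  { name: 'long', ttl: 60_000, limit: 200 },
];

class DualWindowLimiter {
  private readonly state = new Map<string, WindowState>();

  constructor(private readonly windows: WindowConfig[] = DEFAULT_WINDOWS) {}

  /** Returns the name of the first window that rejects, or null if every window passes. */
  check(tracker: string, now: number): string | null {
    for (const w of this.windows) {
      const key = `${tracker}:${w.name}`;
      let s = this.state.get(key);
      if (!s || now >= s.expiresAt) {
        // Start a fresh window on the first request or after expiry.
        s = { count: 0, expiresAt: now + w.ttl };
        this.state.set(key, s);
      }
      s.count += 1;
      if (s.count > w.limit) return w.name;
    }
    return null;
  }
}

// Usage with stricter limits (5 / 10s and 15 / 60s): three full bursts use up the sustained budget.
const strict = new DualWindowLimiter([
  { name: 'short', ttl: 10_000, limit: 5 },
  { name: 'long', ttl: 60_000, limit: 15 },
]);
for (let burst = 0; burst < 3; burst++) {
  for (let i = 0; i < 5; i++) strict.check('ip:203.0.113.50', burst * 10_000);
}
const rejectedBy = strict.check('ip:203.0.113.50', 30_000); // fits the burst window, fails the sustained one
```

Under this fixed-window model, at most six burst windows of 20 requests fit inside one 60-second window (120 requests), so the default sustained limit of 200 would not reject on its own. The usage example therefore relies on the stricter limits to show a sustained-window rejection.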
Dynamic Limit Adjustments
The GqlThrottlerGuard dynamically adjusts limits based on context:
Authenticated User Uplift
Authenticated users (Visitors, Buyers, and Malet Owners) have a verified identity, reducing abuse risk. Their limits are multiplied by a configurable factor:
effectiveLimit = baseLimit × AUTH_MULTIPLIER
Default: 2×. An authenticated Buyer gets 40 burst / 400 sustained, versus 20 / 200 for anonymous traffic.
Mutation Penalty
GraphQL mutations (state-changing operations like createMurchase, updateProduct, addToUCart) receive a stricter limit than queries:
effectiveLimit = effectiveLimit × MUTATION_FACTOR
Default: 0.5×. Mutations are limited to half the query budget because they cause writes.
Combined Effect
| User Type | Operation | Burst Limit | Sustained Limit |
|---|---|---|---|
| Anonymous | Query | 20 | 200 |
| Anonymous | Mutation | 10 | 100 |
| Authenticated | Query | 40 | 400 |
| Authenticated | Mutation | 20 | 200 |
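The table above follows directly from applying the two factors in sequence. The sketch below reproduces that arithmetic. `effectiveLimit` and `LimitContext` are hypothetical names, and rounding down with a floor of 1 is an assumption about how non-integer results might be handled.

```typescript
// Hypothetical helper reproducing the "Combined Effect" table; not the guard's actual code.
interface LimitContext {
  authenticated: boolean;
  isMutation: boolean;
}

const AUTH_MULTIPLIER = 2; // default of RATE_LIMIT_AUTH_MULTIPLIER
const MUTATION_FACTOR = 0.5; // default of RATE_LIMIT_MUTATION_FACTOR

function effectiveLimit(baseLimit: number, ctx: LimitContext): number {
  let limit = baseLimit;
  if (ctx.authenticated) limit *= AUTH_MULTIPLIER; // auth uplift is applied first
  if (ctx.isMutation) limit *= MUTATION_FACTOR; // then the mutation penalty
  // Assumption: round down, but never drop below one request per window.
  return Math.max(1, Math.floor(limit));
}

// Authenticated mutation against the burst window: 20 × 2 × 0.5
const authMutationBurst = effectiveLimit(20, { authenticated: true, isMutation: true });
```

With the defaults, the two factors cancel out for authenticated mutations, which is why that row matches the anonymous query row.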
Per-Endpoint Overrides
`@EndpointThrottle` Decorator
Apply stricter or relaxed limits to specific REST or GraphQL controller endpoints:
import { Controller, Post } from '@nestjs/common';
import { EndpointThrottle, RATE_LIMIT_PRESETS } from '../rate-limit';

@Controller('auth')
export class AuthController {
  // Strict limits for login to prevent brute-force attempts
  @EndpointThrottle(RATE_LIMIT_PRESETS.STRICT_AUTH)
  @Post('login')
  login() { ... }

  // Custom inline limits
  @EndpointThrottle({
    short: { limit: 3, ttl: 10_000 },
    long: { limit: 10, ttl: 60_000 },
  })
  @Post('reset-password')
  resetPassword() { ... }
}
Preset Profiles
Three presets are available in rate-limit.constants.ts:
| Preset | Burst | Sustained | Use Case |
|---|---|---|---|
| `STRICT_AUTH` | 5 / 10s | 15 / 60s | Login, OTP, passkey: brute-force prevention |
| `RELAXED_PUBLIC` | 50 / 10s | 500 / 60s | Public read-only endpoints |
| `DEFAULT` | 20 / 10s | 200 / 60s | Standard API endpoints |
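For orientation, the presets could be expressed as a plain constant along the following lines. This is a sketch only: the object shape is inferred from the inline `@EndpointThrottle({ short, long })` example, and `WindowLimit`, `EndpointThrottleOptions`, and `formatWindow` are hypothetical names rather than the contents of rate-limit.constants.ts.

```typescript
// Assumed shape, inferred from the inline decorator example; not copied from the module.
interface WindowLimit {
  limit: number; // max requests per window
  ttl: number; // window length in ms
}

interface EndpointThrottleOptions {
  short: WindowLimit;
  long: WindowLimit;
}

type PresetName = 'STRICT_AUTH' | 'RELAXED_PUBLIC' | 'DEFAULT';

const RATE_LIMIT_PRESETS: Record<PresetName, EndpointThrottleOptions> = {
  STRICT_AUTH: { short: { limit: 5, ttl: 10_000 }, long: { limit: 15, ttl: 60_000 } },
  RELAXED_PUBLIC: { short: { limit: 50, ttl: 10_000 }, long: { limit: 500, ttl: 60_000 } },
  DEFAULT: { short: { limit: 20, ttl: 10_000 }, long: { limit: 200, ttl: 60_000 } },
};

// Hypothetical helper that renders a window the way the table does, e.g. "5 / 10s".
function formatWindow(w: WindowLimit): string {
  return `${w.limit} / ${w.ttl / 1000}s`;
}

const strictBurst = formatWindow(RATE_LIMIT_PRESETS.STRICT_AUTH.short);
```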
`@SkipThrottle` Exemption
Fully exempt endpoints from rate limiting:
import { Controller } from '@nestjs/common';
import { SkipThrottle } from '@nestjs/throttler';

// Must specify both named throttlers
@SkipThrottle({ short: true, long: true })
@Controller()
export class SitemapController { ... }
Important: Because the gateway uses named throttlers (`short` and `long`), a bare `@SkipThrottle()` with no arguments only skips the `default`-named throttler (which doesn't exist). Always pass `{ short: true, long: true }`.
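The pitfall can be modelled in a few lines. This is a simplified sketch of the lookup, not the library's source: `activeThrottlers`, `SkipMap`, and `BARE_SKIP` are hypothetical names, and the model assumes the skip setting is looked up by throttler name.

```typescript
// Simplified, hypothetical model of per-name skip lookup.
type SkipMap = Record<string, boolean>;

// What a bare @SkipThrottle() is described as meaning: skip only a throttler named "default".
const BARE_SKIP: SkipMap = { default: true };

// The gateway's named throttlers.
const THROTTLER_NAMES = ['short', 'long'];

/** Returns the throttlers that still apply after the skip map is taken into account. */
function activeThrottlers(skip: SkipMap = BARE_SKIP): string[] {
  return THROTTLER_NAMES.filter((name) => skip[name] !== true);
}

const withBareSkip = activeThrottlers(); // both windows still enforced
const withNamedSkip = activeThrottlers({ short: true, long: true }); // nothing enforced
```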
Tracker Keys
Rate limits are tracked per-client using a priority-based key:
// 1. Authenticated user: tracked by userId
'user:abc-123-def';
// 2. Anonymous: tracked by IP (proxy-aware)
'ip:203.0.113.50'; // x-forwarded-for (first hop)
'ip:192.168.1.1'; // x-real-ip fallback
'ip:unknown'; // last resort
This means an authenticated Buyer hitting the API from multiple IPs is tracked as a single identity, while anonymous traffic is tracked per-IP.
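The priority order could be implemented roughly as follows. This is a minimal sketch: `resolveTracker`, `firstHeaderValue`, and the `RequestLike` shape (including `user.id`) are hypothetical and stand in for whatever request object the guard actually receives.

```typescript
// Hypothetical request shape; the real guard may read the user and headers differently.
interface RequestLike {
  user?: { id: string };
  headers: Record<string, string | string[] | undefined>;
}

/** Returns the first comma-separated entry of a header value, or undefined if empty. */
function firstHeaderValue(value: string | string[] | undefined): string | undefined {
  const raw = Array.isArray(value) ? value[0] : value;
  const first = raw?.split(',')[0]?.trim();
  return first ? first : undefined;
}

function resolveTracker(req: RequestLike): string {
  // 1. Authenticated users are tracked by identity, regardless of IP.
  if (req.user?.id) return `user:${req.user.id}`;
  // 2. Anonymous traffic falls back to x-forwarded-for (first hop), then x-real-ip.
  const ip =
    firstHeaderValue(req.headers['x-forwarded-for']) ??
    firstHeaderValue(req.headers['x-real-ip']);
  // 3. Last resort when no address is available.
  return `ip:${ip ?? 'unknown'}`;
}

const tracker = resolveTracker({ headers: { 'x-forwarded-for': '203.0.113.50, 10.0.0.7' } });
```

Because `x-forwarded-for` can be set by the client, a scheme like this is only reliable when a trusted proxy in front of the gateway overwrites or sanitises the header.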
Redis-Backed Storage
For production deployments with multiple gateway instances, rate limit state must be shared. The ThrottlerRedisStorage adapter wraps the platform's existing RedisService (from @app/common):
// Key format in Redis
'throttle:{tracker}:{throttlerName}';
// Example
'throttle:user:abc-123:short'; // burst window counter
'throttle:user:abc-123:long'; // sustained window counter
No new dependencies: the adapter uses the existing ioredis client from RedisService, which is already wired for session management, idempotency, and caching.
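As an illustration of the key format, the helpers below build and split such keys. Both `throttleKey` and `parseThrottleKey` are hypothetical names. Since tracker values themselves contain a colon (`user:…`, `ip:…`), the parser assumes the throttler name is everything after the last colon.

```typescript
// Hypothetical helpers for the 'throttle:{tracker}:{throttlerName}' format.
const KEY_PREFIX = 'throttle:';

function throttleKey(tracker: string, throttlerName: string): string {
  return `${KEY_PREFIX}${tracker}:${throttlerName}`;
}

/** Splits a key back into its parts, or returns null if it does not match the format. */
function parseThrottleKey(key: string): { tracker: string; throttlerName: string } | null {
  if (!key.startsWith(KEY_PREFIX)) return null;
  const rest = key.slice(KEY_PREFIX.length);
  const lastColon = rest.lastIndexOf(':');
  if (lastColon <= 0 || lastColon === rest.length - 1) return null;
  return { tracker: rest.slice(0, lastColon), throttlerName: rest.slice(lastColon + 1) };
}

// Each client has one counter per window.
const burstKey = throttleKey('user:abc-123', 'short');
const sustainedKey = throttleKey('user:abc-123', 'long');
```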
Development Mode
In development, the in-memory throttler storage works fine for a single gateway process. Redis is only required for multi-instance production.
Environment Variables
All rate limiting behaviour is configurable via environment variables with sensible defaults:
| Variable | Default | Description |
|---|---|---|
| `RATE_LIMIT_SHORT_TTL` | `10000` | Burst window duration in ms |
| `RATE_LIMIT_SHORT_MAX` | `20` | Max requests per burst window |
| `RATE_LIMIT_LONG_TTL` | `60000` | Sustained window duration in ms |
| `RATE_LIMIT_LONG_MAX` | `200` | Max requests per sustained window |
| `RATE_LIMIT_AUTH_MULTIPLIER` | `2` | Limit multiplier for authenticated users |
| `RATE_LIMIT_MUTATION_FACTOR` | `0.5` | Limit factor for GraphQL mutations (0.0 to 1.0) |
| `QUERY_MAX_COMPLEXITY` | `500` | Max GraphQL query cost |
| `QUERY_MAX_DEPTH` | `10` | Max GraphQL query nesting depth |
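A loader for these variables might look like the sketch below. `loadRateLimitConfig`, `readNumber`, and the `RateLimitConfig` field names are hypothetical. Falling back to the default on a missing or non-numeric value, and clamping the mutation factor to its documented range, are assumptions rather than confirmed gateway behaviour.

```typescript
// Hypothetical config loader; variable names and defaults come from the table above.
type Env = Record<string, string | undefined>;

interface RateLimitConfig {
  shortTtl: number;
  shortMax: number;
  longTtl: number;
  longMax: number;
  authMultiplier: number;
  mutationFactor: number;
  maxComplexity: number;
  maxDepth: number;
}

/** Reads a numeric variable, falling back to the default when it is unset or invalid. */
function readNumber(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === '') return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) ? parsed : fallback;
}

function loadRateLimitConfig(env: Env): RateLimitConfig {
  const mutationFactor = readNumber(env, 'RATE_LIMIT_MUTATION_FACTOR', 0.5);
  return {
    shortTtl: readNumber(env, 'RATE_LIMIT_SHORT_TTL', 10_000),
    shortMax: readNumber(env, 'RATE_LIMIT_SHORT_MAX', 20),
    longTtl: readNumber(env, 'RATE_LIMIT_LONG_TTL', 60_000),
    longMax: readNumber(env, 'RATE_LIMIT_LONG_MAX', 200),
    authMultiplier: readNumber(env, 'RATE_LIMIT_AUTH_MULTIPLIER', 2),
    // Assumption: keep the factor inside the documented 0.0 to 1.0 range.
    mutationFactor: Math.min(1, Math.max(0, mutationFactor)),
    maxComplexity: readNumber(env, 'QUERY_MAX_COMPLEXITY', 500),
    maxDepth: readNumber(env, 'QUERY_MAX_DEPTH', 10),
  };
}

const config = loadRateLimitConfig({ RATE_LIMIT_SHORT_MAX: '50' });
```

In a Node.js process the function would typically be called with `process.env`. Taking the environment as a parameter keeps the sketch easy to test.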
Error Responses
When a rate limit is exceeded, the gateway returns a 429 Too Many Requests with a descriptive message:
{
  "statusCode": 429,
  "message": "Rate limit exceeded (burst window). Please slow down your requests."
}
The message identifies which window was hit (burst vs sustained) to help Buyers and integrators diagnose the issue.
Response Headers
Every response includes per-window rate limit headers:
X-RateLimit-Limit-Short: 20
X-RateLimit-Remaining-Short: 14
X-RateLimit-Reset-Short: 10
X-RateLimit-Limit-Long: 200
X-RateLimit-Remaining-Long: 187
X-RateLimit-Reset-Long: 60
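A client could collect these headers into a typed structure, as in the sketch below. `readWindowBudget`, `HeaderSource`, and `WindowBudget` are hypothetical names. The sample values suggest that the reset header is expressed in seconds, but that unit is an assumption.

```typescript
// Minimal header interface so the sketch works with fetch's Headers or any similar object.
interface HeaderSource {
  get(name: string): string | null;
}

interface WindowBudget {
  limit: number;
  remaining: number;
  reset: number; // assumed to be seconds until the window resets
}

/** Reads the three headers for one window; returns null if any is missing or non-numeric. */
function readWindowBudget(headers: HeaderSource, suffix: 'Short' | 'Long'): WindowBudget | null {
  const read = (field: string): number => {
    const raw = headers.get(`X-RateLimit-${field}-${suffix}`);
    return raw === null ? NaN : Number.parseInt(raw, 10);
  };
  const budget: WindowBudget = {
    limit: read('Limit'),
    remaining: read('Remaining'),
    reset: read('Reset'),
  };
  const values = [budget.limit, budget.remaining, budget.reset];
  return values.some((n) => Number.isNaN(n)) ? null : budget;
}

// Usage with a stand-in for response.headers.
const sample = new Map<string, string>([
  ['X-RateLimit-Limit-Short', '20'],
  ['X-RateLimit-Remaining-Short', '14'],
  ['X-RateLimit-Reset-Short', '10'],
]);
const sampleHeaders: HeaderSource = { get: (name) => sample.get(name) ?? null };
const shortBudget = readWindowBudget(sampleHeaders, 'Short');
```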
Frontend Integration
When the frontend receives a 429 response, it should display a user-friendly toast or message rather than exposing the raw error. The X-RateLimit-Remaining-* headers can be used for proactive throttling in the client:
// Example: Check remaining budget before submitting
const response = await fetch('/graphql', { ... });
const remaining = parseInt(response.headers.get('X-RateLimit-Remaining-Short') || '999', 10);
if (remaining < 3) {
  showToast('You\'re making requests too quickly. Please wait a moment.');
}
Exempt Endpoints
The following endpoints are fully exempt from rate limiting:
| Endpoint | Reason |
|---|---|
| `GET /sitemap.xml` | Idempotent XML, no auth surface |
| `GET /sitemap-malets.xml` | Idempotent XML, no auth surface |
| `GET /sitemap-blogs.xml` | Idempotent XML, no auth surface |
Testing
Unit Tests
# Run all gateway unit tests (86 tests)
npm run test -- apps/ngwenya-gateway --forceExit
Covers: guard logic, auth uplift, mutation detection, decorator metadata, Redis storage increment/TTL/blocking.
E2E Tests
# Run gateway E2E tests (13 tests)
npx jest --config apps/ngwenya-gateway/test/jest-e2e.json --forceExit
Covers: dual-window enforcement, @SkipThrottle exemptions, @EndpointThrottle overrides, error messages.
Related
- Gateway (Hive Gateway): Federated gateway architecture, composition, benchmarks, and migration details
- Gateway Tracing & Observability: Companion gateway feature (response caching, APQ, per-operation metrics)
- Debugging & Testing: curl workflows, auth pipeline, and order consistency checks
- Search Engine Administration: Admin operations that use the `MANAGE_SEARCH` permission and benefit from rate-limit uplift
- uCart Concurrency & Resilience: Optimistic concurrency control that complements rate limiting for high-contention cart mutations