Gateway Rate Limiting — Developer Guide

Overview

The Ngwenya API Gateway enforces a tiered, endpoint-aware rate limiting system that protects the platform from abuse while providing fair throughput to authenticated Visitors, Buyers, and Malet Owners.

Three layers of protection work in concert:

Layer	Protection	Default
Burst window	Catches rapid-fire scripting	20 req / 10s
Sustained window	Prevents slow-drip abuse	200 req / 60s
Query complexity	Rejects expensive GraphQL queries	Max cost 500, depth 10

Both rate limit windows must be satisfied for a request to pass. Query complexity is evaluated independently after the rate check.

Architecture

The rate limiting system is implemented in the apps/ngwenya-gateway/src/rate-limit/ module and consists of five files:

File	Purpose
`rate-limit.module.ts`	Module registration with dual-window `ThrottlerModule` config
`rate-limit.constants.ts`	Central defaults, env var keys, and preset profiles
`gql-throttler.guard.ts`	Custom `ThrottlerGuard` with auth uplift and mutation penalties
`endpoint-throttle.decorator.ts`	`@EndpointThrottle()` for per-endpoint overrides
`throttler-redis-storage.ts`	Redis-backed `ThrottlerStorage` adapter

Request Flow

graph TD
    A["Incoming Request"] --> B{"@SkipThrottle?"}
    B -->|Yes| C["✅ Pass Through"]
    B -->|No| D{"@EndpointThrottle?"}
    D -->|Yes| E["Use endpoint limits"]
    D -->|No| F["Use global defaults"]
    E --> G{"Authenticated?"}
    F --> G
    G -->|Yes| H["limit × 2.0"]
    G -->|No| I["keep base limit"]
    H --> J{"GraphQL Mutation?"}
    I --> J
    J -->|Yes| K["limit × 0.5"]
    J -->|No| L["keep current limit"]
    K --> M{"Check SHORT window"}
    L --> M
    M -->|Pass| N{"Check LONG window"}
    M -->|Fail| O["❌ 429 burst window"]
    N -->|Pass| P{"Query Complexity Check"}
    N -->|Fail| Q["❌ 429 sustained window"]
    P -->|Pass| R["✅ Resolve"]
    P -->|Fail| S["❌ QUERY_TOO_COMPLEX"]

    style C fill:#22c55e,color:#fff
    style R fill:#22c55e,color:#fff
    style O fill:#ef4444,color:#fff
    style Q fill:#ef4444,color:#fff
    style S fill:#ef4444,color:#fff

Dual-Window Throttling

Instead of a single rate limit window, the gateway enforces two concurrent windows:

Short (Burst) Window

{
  name: 'short',
  ttl: 10_000,   // 10 seconds
  limit: 20,     // 20 requests per window
}

This catches rapid-fire automated requests — bots, scrapers, or brute-force scripts that fire dozens of requests per second.

Long (Sustained) Window

{
  name: 'long',
  ttl: 60_000,   // 60 seconds
  limit: 200,    // 200 requests per window
}

This catches slower abuse patterns that would slip under the burst window but are still abnormally high when measured over a minute.

Both windows must pass. A request that fits within the burst budget but has exceeded the sustained budget will still be rejected.
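
The both-must-pass rule can be sketched as a small pure function. This is a minimal illustration rather than the gateway's guard code: the `WindowState` type and the `checkWindows` name are hypothetical, and it assumes the short window is evaluated first, as in the request flow.

```typescript
// Hypothetical types for illustration; not taken from the gateway source.
type WindowName = 'short' | 'long';

interface WindowState {
  name: WindowName;
  limit: number; // max requests allowed in the window
  hits: number;  // requests counted so far, including the current one
}

type Verdict =
  | { allowed: true }
  | { allowed: false; failedWindow: WindowName };

// Returns the first window whose budget is exceeded, or allowed if none is.
function checkWindows(windows: WindowState[]): Verdict {
  for (const w of windows) {
    if (w.hits > w.limit) {
      return { allowed: false, failedWindow: w.name };
    }
  }
  return { allowed: true };
}

// Within the burst budget (5 <= 20) but over the sustained budget (201 > 200).
const verdict = checkWindows([
  { name: 'short', limit: 20, hits: 5 },
  { name: 'long', limit: 200, hits: 201 },
]);
console.log(verdict); // { allowed: false, failedWindow: 'long' }
```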

Dynamic Limit Adjustments

The GqlThrottlerGuard dynamically adjusts limits based on context:

Authenticated User Uplift

Authenticated users (Visitors, Buyers, and Malet Owners) have a verified identity, reducing abuse risk. Their limits are multiplied by a configurable factor:

effectiveLimit = baseLimit × AUTH_MULTIPLIER

Default: 2× — an authenticated Buyer gets 40 burst / 400 sustained vs. 20/200 for anonymous traffic.

Mutation Penalty

GraphQL mutations (state-changing operations like createMurchase, updateProduct, addToUCart) receive a stricter limit than queries:

effectiveLimit = effectiveLimit × MUTATION_FACTOR

Default: 0.5× — mutations are limited to half the query budget since they cause writes.

Combined Effect

User Type	Operation	Burst Limit	Sustained Limit
Anonymous	Query	20	200
Anonymous	Mutation	10	100
Authenticated	Query	40	400
Authenticated	Mutation	20	200
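
The table above follows directly from applying the two factors in sequence. The sketch below reproduces it with the default values; the `effectiveLimit` function and `RequestContext` type are hypothetical names, and rounding down with a floor of 1 is an assumption about how fractional results might be handled.

```typescript
// Hypothetical shape for illustration only.
interface RequestContext {
  authenticated: boolean;
  isMutation: boolean;
}

const AUTH_MULTIPLIER = 2;   // default of RATE_LIMIT_AUTH_MULTIPLIER
const MUTATION_FACTOR = 0.5; // default of RATE_LIMIT_MUTATION_FACTOR

function effectiveLimit(baseLimit: number, ctx: RequestContext): number {
  let limit = baseLimit;
  if (ctx.authenticated) limit *= AUTH_MULTIPLIER; // auth uplift first
  if (ctx.isMutation) limit *= MUTATION_FACTOR;    // then the mutation penalty
  // Assumption: round down and never drop below one request per window.
  return Math.max(1, Math.floor(limit));
}

const BURST = 20;
const SUSTAINED = 200;

for (const authenticated of [false, true]) {
  for (const isMutation of [false, true]) {
    const ctx = { authenticated, isMutation };
    console.log(
      authenticated ? 'Authenticated' : 'Anonymous',
      isMutation ? 'Mutation' : 'Query',
      effectiveLimit(BURST, ctx),
      effectiveLimit(SUSTAINED, ctx),
    );
  }
}
// Anonymous Query 20 200
// Anonymous Mutation 10 100
// Authenticated Query 40 400
// Authenticated Mutation 20 200
```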

Per-Endpoint Overrides

`@EndpointThrottle` Decorator

Apply stricter or more relaxed limits to specific REST controller or GraphQL resolver endpoints:

import { Controller, Post } from '@nestjs/common';
import { EndpointThrottle, RATE_LIMIT_PRESETS } from '../rate-limit';

@Controller('auth')
export class AuthController {
  // Strict limits for login: prevents brute-force attempts
  @EndpointThrottle(RATE_LIMIT_PRESETS.STRICT_AUTH)
  @Post('login')
  login() { /* ... */ }

  // Custom inline limits (ttl values are in milliseconds)
  @EndpointThrottle({
    short: { limit: 3, ttl: 10_000 },
    long: { limit: 10, ttl: 60_000 },
  })
  @Post('reset-password')
  resetPassword() { /* ... */ }
}

Preset Profiles

Three presets are available in rate-limit.constants.ts:

Preset	Burst	Sustained	Use Case
`STRICT_AUTH`	5 / 10s	15 / 60s	Login, OTP, passkey — brute-force prevention
`RELAXED_PUBLIC`	50 / 10s	500 / 60s	Public read-only endpoints
`DEFAULT`	20 / 10s	200 / 60s	Standard API endpoints
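
For orientation, the presets in the table could be expressed as a constant like the one below. This is a sketch only: the object shape is inferred from the inline `@EndpointThrottle({ short, long })` example above, the type names are hypothetical, and the actual contents of rate-limit.constants.ts may differ.

```typescript
// Hypothetical types; the shape mirrors the inline decorator options shown earlier.
interface WindowLimit {
  limit: number; // max requests per window
  ttl: number;   // window length in milliseconds
}

interface EndpointThrottleOptions {
  short: WindowLimit;
  long: WindowLimit;
}

type PresetName = 'STRICT_AUTH' | 'RELAXED_PUBLIC' | 'DEFAULT';

const RATE_LIMIT_PRESETS: Record<PresetName, EndpointThrottleOptions> = {
  STRICT_AUTH: { short: { limit: 5, ttl: 10_000 }, long: { limit: 15, ttl: 60_000 } },
  RELAXED_PUBLIC: { short: { limit: 50, ttl: 10_000 }, long: { limit: 500, ttl: 60_000 } },
  DEFAULT: { short: { limit: 20, ttl: 10_000 }, long: { limit: 200, ttl: 60_000 } },
};

// Formats a preset the same way the table does, e.g. "5 / 10s burst, 15 / 60s sustained".
function describePreset(name: PresetName): string {
  const { short, long } = RATE_LIMIT_PRESETS[name];
  return `${short.limit} / ${short.ttl / 1000}s burst, ${long.limit} / ${long.ttl / 1000}s sustained`;
}

console.log(describePreset('STRICT_AUTH')); // 5 / 10s burst, 15 / 60s sustained
```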

`@SkipThrottle` Exemption

Fully exempt endpoints from rate limiting:

import { Controller } from '@nestjs/common';
import { SkipThrottle } from '@nestjs/throttler';

// Must specify both named throttlers
@SkipThrottle({ short: true, long: true })
@Controller()
export class SitemapController { /* ... */ }

Important: Because the gateway uses named throttlers (short and long), a bare @SkipThrottle() with no arguments only skips the default-named throttler (which doesn't exist). Always pass { short: true, long: true }.
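
The pitfall is easier to see in a simplified model of the skip lookup. The code below does not call the throttler library; `skipThrottle` and `isSkipped` are hypothetical stand-ins that only mimic the behaviour described above, where a bare call is treated as `{ default: true }`.

```typescript
type SkipMap = Record<string, boolean>;

// Simplified stand-in: with no argument, only the throttler named "default" is skipped.
function skipThrottle(skip: SkipMap = { default: true }): SkipMap {
  return skip;
}

// A throttler is skipped only if its own name is explicitly flagged.
function isSkipped(skip: SkipMap, throttlerName: string): boolean {
  return skip[throttlerName] === true;
}

const THROTTLERS = ['short', 'long']; // the gateway's named throttlers

const bare = skipThrottle();
const explicit = skipThrottle({ short: true, long: true });

console.log(THROTTLERS.map((name) => isSkipped(bare, name)));     // [ false, false ]
console.log(THROTTLERS.map((name) => isSkipped(explicit, name))); // [ true, true ]
```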

Tracker Keys

Rate limits are tracked per-client using a priority-based key:

// 1. Authenticated user — tracked by userId
'user:abc-123-def';

// 2. Anonymous — tracked by IP (proxy-aware)
'ip:203.0.113.50'; // x-forwarded-for (first hop)
'ip:192.168.1.1'; // x-real-ip fallback
'ip:unknown'; // last resort

This means an authenticated Buyer hitting the API from multiple IPs is tracked as a single identity, while anonymous traffic is tracked per-IP.
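
A minimal sketch of that priority order is shown below. The `TrackedRequest` shape and the `getTrackerKey` name are assumptions for illustration; the guard's real request type and method names may differ.

```typescript
// Hypothetical request shape; only the fields the sketch needs.
interface TrackedRequest {
  user?: { id: string };
  headers: Record<string, string | string[] | undefined>;
}

// Takes the first value of a possibly repeated, comma-separated header.
function firstHeaderValue(value: string | string[] | undefined): string | undefined {
  const raw = Array.isArray(value) ? value[0] : value;
  const first = raw?.split(',')[0]?.trim();
  return first ? first : undefined;
}

function getTrackerKey(req: TrackedRequest): string {
  // 1. Authenticated user: tracked by userId regardless of IP.
  if (req.user?.id) return `user:${req.user.id}`;

  // 2. Anonymous: first hop of x-forwarded-for, then x-real-ip.
  const forwarded = firstHeaderValue(req.headers['x-forwarded-for']);
  if (forwarded) return `ip:${forwarded}`;

  const realIp = firstHeaderValue(req.headers['x-real-ip']);
  if (realIp) return `ip:${realIp}`;

  // 3. Last resort when no address information is available.
  return 'ip:unknown';
}

console.log(getTrackerKey({ user: { id: 'abc-123-def' }, headers: {} }));
// user:abc-123-def
console.log(getTrackerKey({ headers: { 'x-forwarded-for': '203.0.113.50, 10.0.0.1' } }));
// ip:203.0.113.50
```

Note that `x-forwarded-for` is set by the client unless a trusted proxy overwrites it, so this ordering is only as reliable as the proxy configuration in front of the gateway.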

Redis-Backed Storage

For production deployments with multiple gateway instances, rate limit state must be shared. The ThrottlerRedisStorage adapter wraps the platform's existing RedisService (from @app/common):

// Key format in Redis
'throttle:{tracker}:{throttlerName}';

// Example
'throttle:user:abc-123:short'; // burst window counter
'throttle:user:abc-123:long'; // sustained window counter

No new dependencies — the adapter uses the existing ioredis client from RedisService, which is already wired for session management, idempotency, and caching.
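
The counting logic behind such an adapter can be sketched against a tiny store interface. Everything here is hypothetical: `CounterStore`, `InMemoryStore`, `throttleKey`, and `incrementWindow` are illustrative names, the in-memory class stands in for Redis so the example runs without a server, and the real `ThrottlerRedisStorage` may be structured differently.

```typescript
// Minimal subset of Redis-like commands the sketch relies on (names mirror INCR, PEXPIRE, PTTL).
interface CounterStore {
  incr(key: string): Promise<number>;
  pexpire(key: string, ms: number): Promise<void>;
  pttl(key: string): Promise<number>;
}

// In-memory stand-in for Redis, with an injectable clock for deterministic tests.
class InMemoryStore implements CounterStore {
  private entries = new Map<string, { count: number; expiresAt: number }>();

  constructor(private readonly now: () => number = () => Date.now()) {}

  private read(key: string) {
    const entry = this.entries.get(key);
    if (entry && entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired keys behave as if they never existed
      return undefined;
    }
    return entry;
  }

  async incr(key: string): Promise<number> {
    const entry = this.read(key) ?? { count: 0, expiresAt: Infinity };
    entry.count += 1;
    this.entries.set(key, entry);
    return entry.count;
  }

  async pexpire(key: string, ms: number): Promise<void> {
    const entry = this.read(key);
    if (entry) entry.expiresAt = this.now() + ms;
  }

  async pttl(key: string): Promise<number> {
    const entry = this.read(key);
    if (!entry) return -2; // same convention as Redis: key does not exist
    return entry.expiresAt === Infinity ? -1 : entry.expiresAt - this.now();
  }
}

// Builds the documented key format: throttle:{tracker}:{throttlerName}
function throttleKey(tracker: string, throttlerName: string): string {
  return `throttle:${tracker}:${throttlerName}`;
}

async function incrementWindow(store: CounterStore, tracker: string, throttlerName: string, ttlMs: number) {
  const key = throttleKey(tracker, throttlerName);
  const totalHits = await store.incr(key);
  if (totalHits === 1) await store.pexpire(key, ttlMs); // the first hit starts the window
  const timeToExpireMs = await store.pttl(key);
  return { key, totalHits, timeToExpireMs };
}

incrementWindow(new InMemoryStore(), 'user:abc-123', 'short', 10_000).then((result) => {
  console.log(result.key, result.totalHits); // throttle:user:abc-123:short 1
});
```

Because the increment and the expiry are two separate commands here, a crash between them could leave a counter without a TTL. Against a real Redis server, running both steps in a single script or transaction would avoid that gap.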

Development Mode

In development, the in-memory throttler storage is sufficient for a single gateway process. Redis is only required for multi-instance production deployments.

Environment Variables

All rate limiting behaviour is configurable via environment variables with sensible defaults:

Variable	Default	Description
`RATE_LIMIT_SHORT_TTL`	`10000`	Burst window duration in ms
`RATE_LIMIT_SHORT_MAX`	`20`	Max requests per burst window
`RATE_LIMIT_LONG_TTL`	`60000`	Sustained window duration in ms
`RATE_LIMIT_LONG_MAX`	`200`	Max requests per sustained window
`RATE_LIMIT_AUTH_MULTIPLIER`	`2`	Limit multiplier for authenticated users
`RATE_LIMIT_MUTATION_FACTOR`	`0.5`	Limit factor for GraphQL mutations (0.0–1.0)
`QUERY_MAX_COMPLEXITY`	`500`	Max GraphQL query cost
`QUERY_MAX_DEPTH`	`10`	Max GraphQL query nesting depth
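
A loader for these variables might look like the sketch below. The `RateLimitConfig` type and both function names are hypothetical, and falling back to the default on missing, non-numeric, or non-positive values is an assumption about the desired behaviour rather than something the gateway is known to do.

```typescript
// Hypothetical config shape for illustration.
interface RateLimitConfig {
  shortTtl: number;
  shortMax: number;
  longTtl: number;
  longMax: number;
  authMultiplier: number;
  mutationFactor: number;
  queryMaxComplexity: number;
  queryMaxDepth: number;
}

type Env = Record<string, string | undefined>;

// Parses a positive number, returning the fallback for anything unusable.
function readNumber(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === '') return fallback;
  const parsed = Number(raw);
  return isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

function loadRateLimitConfig(env: Env): RateLimitConfig {
  return {
    shortTtl: readNumber(env, 'RATE_LIMIT_SHORT_TTL', 10000),
    shortMax: readNumber(env, 'RATE_LIMIT_SHORT_MAX', 20),
    longTtl: readNumber(env, 'RATE_LIMIT_LONG_TTL', 60000),
    longMax: readNumber(env, 'RATE_LIMIT_LONG_MAX', 200),
    authMultiplier: readNumber(env, 'RATE_LIMIT_AUTH_MULTIPLIER', 2),
    // The documented range tops out at 1.0, so larger values are clamped.
    mutationFactor: Math.min(1, readNumber(env, 'RATE_LIMIT_MUTATION_FACTOR', 0.5)),
    queryMaxComplexity: readNumber(env, 'QUERY_MAX_COMPLEXITY', 500),
    queryMaxDepth: readNumber(env, 'QUERY_MAX_DEPTH', 10),
  };
}

const config = loadRateLimitConfig({ RATE_LIMIT_SHORT_MAX: '50', RATE_LIMIT_LONG_TTL: 'abc' });
console.log(config.shortMax, config.longTtl); // 50 60000
```

In a Node.js process the same function could be called with `process.env`.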

Error Responses

When a rate limit is exceeded, the gateway returns a 429 Too Many Requests response with a descriptive message:

{
	"statusCode": 429,
	"message": "Rate limit exceeded (burst window). Please slow down your requests."
}

The message identifies which window was hit (burst vs sustained) to help Buyers and integrators diagnose the issue.
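
As a sketch of how the window name could map to the message, and how an integrator might read it back, consider the helpers below. Both function names are hypothetical, and the wording of the sustained-window message is an assumption made by analogy with the burst example above.

```typescript
type ThrottlerName = 'short' | 'long';
type WindowLabel = 'burst' | 'sustained';

const WINDOW_LABELS: Record<ThrottlerName, WindowLabel> = {
  short: 'burst',
  long: 'sustained',
};

// Server side: builds the 429 body for the window that was exceeded.
function buildThrottleError(throttlerName: ThrottlerName): { statusCode: number; message: string } {
  const label = WINDOW_LABELS[throttlerName];
  return {
    statusCode: 429,
    message: `Rate limit exceeded (${label} window). Please slow down your requests.`,
  };
}

// Client side: recovers which window was hit, or null if the message has another shape.
function windowFromMessage(message: string): WindowLabel | null {
  const match = /\((burst|sustained) window\)/.exec(message);
  return match ? (match[1] as WindowLabel) : null;
}

const body = buildThrottleError('short');
console.log(body.message);
// Rate limit exceeded (burst window). Please slow down your requests.
console.log(windowFromMessage(body.message)); // burst
```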

Response Headers

Every response includes per-window rate limit headers:

X-RateLimit-Limit-Short: 20
X-RateLimit-Remaining-Short: 14
X-RateLimit-Reset-Short: 10
X-RateLimit-Limit-Long: 200
X-RateLimit-Remaining-Long: 187
X-RateLimit-Reset-Long: 60
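
A client can collect these headers into a typed structure, as in the sketch below. The helper names and the `HeaderReader` interface are hypothetical (the interface matches the `get` method of the Fetch API's `Headers` object), and treating the Reset value as seconds is an assumption based on the example values.

```typescript
// Anything with a Headers-like get(), e.g. response.headers from fetch().
interface HeaderReader {
  get(name: string): string | null;
}

interface WindowBudget {
  limit: number;
  remaining: number;
  resetSeconds: number; // assumption: the Reset header is expressed in seconds
}

function toInt(value: string | null): number | null {
  if (value === null) return null;
  const parsed = parseInt(value, 10);
  return isNaN(parsed) ? null : parsed;
}

// Reads one window's headers; returns null if any of the three is missing or malformed.
function readWindow(headers: HeaderReader, suffix: 'Short' | 'Long'): WindowBudget | null {
  const limit = toInt(headers.get(`X-RateLimit-Limit-${suffix}`));
  const remaining = toInt(headers.get(`X-RateLimit-Remaining-${suffix}`));
  const resetSeconds = toInt(headers.get(`X-RateLimit-Reset-${suffix}`));
  if (limit === null || remaining === null || resetSeconds === null) return null;
  return { limit, remaining, resetSeconds };
}

// Test helper: wraps a plain object as a case-insensitive HeaderReader.
function fromRecord(record: Record<string, string>): HeaderReader {
  const lower: Record<string, string> = {};
  for (const key of Object.keys(record)) lower[key.toLowerCase()] = record[key];
  return {
    get: (name) => {
      const k = name.toLowerCase();
      return Object.prototype.hasOwnProperty.call(lower, k) ? lower[k] : null;
    },
  };
}

const headers = fromRecord({
  'X-RateLimit-Limit-Short': '20',
  'X-RateLimit-Remaining-Short': '14',
  'X-RateLimit-Reset-Short': '10',
});
console.log(readWindow(headers, 'Short')); // { limit: 20, remaining: 14, resetSeconds: 10 }
console.log(readWindow(headers, 'Long'));  // null
```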

Frontend Integration

When the frontend receives a 429 response, it should display a user-friendly toast or message rather than exposing the raw error. The X-RateLimit-Remaining-* headers can be used for proactive throttling in the client:

// Example: inspect the remaining burst budget after each response
const response = await fetch('/graphql', { /* ... */ });
const remaining = parseInt(response.headers.get('X-RateLimit-Remaining-Short') ?? '999', 10);

if (remaining < 3) {
  showToast("You're making requests too quickly. Please wait a moment.");
}

Exempt Endpoints

The following endpoints are fully exempt from rate limiting:

Endpoint	Reason
`GET /sitemap.xml`	Idempotent XML, no auth surface
`GET /sitemap-malets.xml`	Idempotent XML, no auth surface
`GET /sitemap-blogs.xml`	Idempotent XML, no auth surface

Testing

Unit Tests

# Run all gateway unit tests (86 tests)
npm run test -- apps/ngwenya-gateway --forceExit

Covers: guard logic, auth uplift, mutation detection, decorator metadata, Redis storage increment/TTL/blocking.

E2E Tests

# Run gateway E2E tests (13 tests)
npx jest --config apps/ngwenya-gateway/test/jest-e2e.json --forceExit

Covers: dual-window enforcement, @SkipThrottle exemptions, @EndpointThrottle overrides, error messages.

Related Documentation

Gateway (Hive Gateway): Federated gateway architecture, composition, benchmarks, and migration details
Gateway Tracing & Observability: Companion gateway feature covering response caching, APQ, and per-operation metrics
Debugging & Testing: curl workflows, auth pipeline, and order consistency checks
Search Engine Administration: Admin operations that use the MANAGE_SEARCH permission and benefit from rate-limit uplift
uCart Concurrency & Resilience: Optimistic concurrency control that complements rate limiting for high-contention cart mutations