Media Infrastructure: Developer Guide
Overview
The Ngwenya platform provides a two-service media architecture: a core media service for uploads, image processing, and storage abstraction, and a dedicated video service for FFmpeg-based transcoding. Together, these services power all product imagery, Malet branding, and video content on the platform.
| Service | Port | Purpose | Key Libraries |
|---|---|---|---|
| media | TCP microservice | Upload, storage, image optimization, CDN rewriting | sharp, exifr, @aws-sdk/client-s3 |
| video | TCP:30021 | HLS transcoding, poster generation, metadata extraction | fluent-ffmpeg, @ffmpeg-installer/ffmpeg |
| imaging | TCP microservice | AI background removal, shadow generation | ONNX Runtime, rembg model |
Architecture
graph TD
A["Upload Request"] --> B["MediaService"]
B --> C{"Image?"}
C -->|Yes| D["ImageProcessorService"]
C -->|No| G{"Video?"}
D --> D1["sharp: WebP conversion"]
D --> D2["sharp: 3 Thumbnails (parallel)"]
D --> D3["sharp: Blur Placeholder (LQIP)"]
D --> D4["exifr: EXIF Extraction"]
D --> D5{"AI Enhance?"}
D5 -->|Yes| D6["ImageEnhancementService"]
D5 -->|No| E
D1 --> E["StorageProvider.upload()"]
G -->|Yes| H["VideoTranscodeService (TCP)"]
G -->|No| E
E --> F["CdnService.rewriteUrl()"]
F --> I[("MongoDB: media_files")]
H --> H1["FFmpeg: HLS + poster"]
H1 --> E
B --> J{"SHA-256 match?"}
J -->|Duplicate| K["Clone record, skip reprocessing"]
J -->|New| C
style B fill:#22c55e,color:#fff
style D fill:#3b82f6,color:#fff
style H fill:#f59e0b,color:#fff
style K fill:#8b5cf6,color:#fff
Upload Pipeline
Two Upload Modes
The platform supports two upload flows to accommodate different client needs:
| Flow | How It Works | Use Case |
|---|---|---|
| Direct Upload | Client sends file → POST /media/upload → server processes & stores | Small files, admin tools |
| Presigned URL | Client calls requestUploadUrl → uploads directly to S3 → calls confirmUpload | Large files, frontend product editor |
Direct Upload Flow
Client → POST /media/upload (multipart)
├── 1. SHA-256 hash → check for duplicate
│   └── Duplicate? Clone record & return (skip processing)
├── 2. Is image? → ImageProcessorService pipeline
│   ├── Auto-rotate (EXIF orientation)
│   ├── Convert to WebP (quality: 80)
│   ├── Generate 3 thumbnails (150px, 300px, 600px)
│   ├── Generate blur placeholder (20px, base64 data URI)
│   ├── Extract EXIF metadata (GPS, camera, date)
│   └── Optional: AI background removal
├── 3. Upload to S3/R2 via StorageProvider
├── 4. Rewrite URLs through CdnService
└── 5. Save MediaFile to MongoDB (status: COMPLETE)
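The branching in this flow can be summarized as a small decision function. This is a sketch only: `planDirectUpload`, `UploadContext`, and the step labels are hypothetical names chosen for illustration, not identifiers from the codebase, and the video branch from the architecture diagram is left out.

```typescript
// Hypothetical sketch: lists the steps the direct upload flow would run.
// Step names are illustrative labels, not real service methods.
interface UploadContext {
  mimetype: string;
  isDuplicate: boolean;       // result of the SHA-256 lookup
  removeBackground?: boolean; // optional AI enhancement flag
}

function planDirectUpload(ctx: UploadContext): string[] {
  const steps: string[] = ['hash'];
  if (ctx.isDuplicate) {
    // Duplicates short-circuit: clone the record and skip processing.
    return [...steps, 'clone-record'];
  }
  if (ctx.mimetype.startsWith('image/')) {
    steps.push('auto-rotate', 'webp', 'thumbnails', 'blur-placeholder', 'exif');
    if (ctx.removeBackground) steps.push('background-removal');
  }
  // Every non-duplicate file is stored, CDN-rewritten, and recorded.
  steps.push('storage-upload', 'cdn-rewrite', 'save-record');
  return steps;
}
```

Non-image files skip the image pipeline entirely and go straight to storage, which matches the tree above.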
Presigned URL Flow (confirmUpload)
// 1. Frontend requests a presigned URL
const { uploadUrl, fileId } = await requestUploadUrl({
  filename: 'product-photo.jpg',
  mimetype: 'image/jpeg',
  maletId: 'malet-123'
});
// 2. Frontend uploads directly to S3
await fetch(uploadUrl, { method: 'PUT', body: file });
// 3. Frontend confirms the upload, which triggers server-side processing
const mediaFile = await confirmUpload({
  fileId,
  removeBackground: true // optional AI enhancement
});
The confirmUpload mutation validates the file exists in S3, downloads it, runs the full image pipeline (WebP, thumbnails, blur, EXIF), and re-uploads the optimized versions.
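On the server, that sequence might look like the sketch below. It assumes a storage object exposing `exists`, `download`, and `upload`; `ObjectStore`, `ProcessImage`, `confirmUploadSketch`, and the key layout for variants are hypothetical stand-ins, and the processing step is injected so the example runs without sharp.

```typescript
// Hypothetical sketch of the confirm step; not the actual service code.
interface ObjectStore {
  exists(key: string): Promise<boolean>;
  download(key: string): Promise<Uint8Array>;
  upload(key: string, body: Uint8Array): Promise<string>; // resolves to a URL
}

// Maps a variant name (e.g. "webp", "thumb-small") to its encoded bytes.
type ProcessImage = (raw: Uint8Array) => Promise<Map<string, Uint8Array>>;

async function confirmUploadSketch(
  store: ObjectStore,
  processImage: ProcessImage,
  key: string
): Promise<Record<string, string>> {
  // 1. Validate that the client actually uploaded the object.
  if (!(await store.exists(key))) {
    throw new Error(`No uploaded object found at ${key}`);
  }
  // 2. Download the raw file and run the image pipeline on it.
  const variants = await processImage(await store.download(key));
  // 3. Re-upload each optimized variant under a key derived from the original.
  const urls: Record<string, string> = {};
  for (const [name, bytes] of variants) {
    urls[name] = await store.upload(`${key}.${name}`, bytes);
  }
  return urls;
}
```

Rejecting early when the object is missing keeps a stray `confirmUpload` call from creating a record that points at nothing.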
Image Processing
All image processing uses the sharp library for high-performance, zero-copy operations.
Thumbnail Generation
Three sizes are generated in parallel for every image upload:
| Label | Max Dimension | Quality | Format | Use Case |
|---|---|---|---|---|
| small | 150px | 70% | WebP | Grid thumbnails, search results |
| medium | 300px | 70% | WebP | Product cards, carousels |
| large | 600px | 70% | WebP | Product detail, lightbox preview |
const DEFAULT_THUMBNAIL_SIZES = [
  { label: 'small', maxDimension: 150 },
  { label: 'medium', maxDimension: 300 },
  { label: 'large', maxDimension: 600 }
];
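To make "max dimension" concrete, the sketch below computes the output size for each thumbnail so that the longer edge is capped and the aspect ratio is kept. `fitWithin` is a hypothetical helper; sharp's `resize` with `fit: 'inside'` performs an equivalent calculation internally. Whether the service avoids upscaling small sources is an assumption made here.

```typescript
interface ThumbnailSize {
  label: string;
  maxDimension: number;
}

const DEFAULT_THUMBNAIL_SIZES: ThumbnailSize[] = [
  { label: 'small', maxDimension: 150 },
  { label: 'medium', maxDimension: 300 },
  { label: 'large', maxDimension: 600 }
];

// Hypothetical helper: cap the longer edge at maxDimension, preserve the
// aspect ratio, and never enlarge a smaller source (assumption).
function fitWithin(
  width: number,
  height: number,
  maxDimension: number
): { width: number; height: number } {
  const longer = Math.max(width, height);
  if (longer <= maxDimension) return { width, height };
  return {
    width: Math.max(1, Math.round((width * maxDimension) / longer)),
    height: Math.max(1, Math.round((height * maxDimension) / longer))
  };
}

// Target sizes for a 4000x3000 source image.
const targets = DEFAULT_THUMBNAIL_SIZES.map((size) => ({
  label: size.label,
  ...fitWithin(4000, 3000, size.maxDimension)
}));
```

For a 4000x3000 source, the medium thumbnail comes out at 300x225 and the large one at 600x450.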
Blur Placeholder (LQIP)
Every image generates a Low Quality Image Placeholder: a tiny 20px-wide WebP encoded as a base64 data URI. The frontend uses this for instant-loading skeleton shimmer effects while the full image loads.
// Stored on the MediaFile document as `blurDataUrl`
// "data:image/webp;base64,UklGRvQAAABXRUJQ..."
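Assembling the data URI is a plain base64 step once the tiny WebP bytes exist. The sketch below assumes the resized bytes are already available (for example from sharp); `toBlurDataUrl` is a hypothetical helper name.

```typescript
// Hypothetical helper: wraps already-encoded image bytes in a data URI.
function toBlurDataUrl(bytes: Uint8Array, mimetype = 'image/webp'): string {
  const base64 = Buffer.from(bytes).toString('base64');
  return `data:${mimetype};base64,${base64}`;
}

// "RIFF" is the four-byte magic that opens every WebP file.
const riff = new Uint8Array([0x52, 0x49, 0x46, 0x46]);
const blurDataUrl = toBlurDataUrl(riff);
// → "data:image/webp;base64,UklGRg=="
```

Because every WebP file starts with the ASCII bytes `RIFF`, every stored `blurDataUrl` begins with `UklG` after the base64 prefix, as in the example value above.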
EXIF Extraction
The exifr library extracts embedded metadata before sharp strips it during optimization:
| EXIF Field | Example | Use Case |
|---|---|---|
| GPS coordinates | { latitude: -33.86, longitude: 151.20 } | Geo-tagging product origins |
| Camera model | "Canon EOS R5" | Photography vertical attribution |
| Date taken | "2026-03-15T10:30:00" | Chronological gallery sorting |
| Lens info | "RF 24-70mm F2.8L" | Photography portfolio display |
const exif = await this.imageProcessorService.extractExif(buffer);
// { Make: 'Canon', Model: 'EOS R5', GPSLatitude: -33.86, ... }
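Mapping the raw tags onto the four fields in the table could look like the following sketch. `Make`, `Model`, `LensModel`, and `DateTimeOriginal` are standard EXIF tag names; the `latitude` and `longitude` keys are assumed to follow exifr's parsed output; `summarizeExif` and the `ExifSummary` shape are hypothetical.

```typescript
// Hypothetical shape for the subset of metadata the table describes.
interface ExifSummary {
  gps?: { latitude: number; longitude: number };
  camera?: string;
  lens?: string;
  takenAt?: string;
}

// Hypothetical helper: picks display fields out of a raw EXIF tag object.
function summarizeExif(raw: Record<string, unknown>): ExifSummary {
  const { latitude, longitude, LensModel, DateTimeOriginal } = raw;
  const summary: ExifSummary = {};
  if (typeof latitude === 'number' && typeof longitude === 'number') {
    summary.gps = { latitude, longitude };
  }
  // Join manufacturer and model, ignoring whichever is missing.
  const camera = [raw.Make, raw.Model].filter((v) => typeof v === 'string').join(' ');
  if (camera) summary.camera = camera;
  if (typeof LensModel === 'string') summary.lens = LensModel;
  if (DateTimeOriginal instanceof Date) {
    summary.takenAt = DateTimeOriginal.toISOString();
  } else if (typeof DateTimeOriginal === 'string') {
    summary.takenAt = DateTimeOriginal;
  }
  return summary;
}
```

Fields that are absent from the file are simply omitted, so images without metadata produce an empty summary rather than an error.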
SHA-256 Deduplication
Every file upload computes a SHA-256 hash of the raw buffer. If an identical hash already exists within the same Malet, the system clones the metadata record instead of re-processing and re-uploading.
How It Works
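import { createHash } from 'node:crypto';
import { nanoid } from 'nanoid';

const hash = createHash('sha256').update(file.buffer).digest('hex');
// Assumes a Mongoose model: .lean() returns a plain object that is safe to spread
const existing = await this.mediaFileModel
  .findOne({ hash, maletId, status: MediaStatus.COMPLETE })
  .lean();
if (existing) {
  // Clone the record: same S3 key, same thumbnails, new document ID.
  // _id is dropped so MongoDB assigns a fresh one.
  const { _id, ...fields } = existing;
  return this.mediaFileModel.create({ ...fields, id: nanoid() });
}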
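The same lookup can be exercised without a database. The sketch below uses Node's built-in `crypto` module and an in-memory array in place of the MongoDB collection; `MediaRecord`, `sha256Hex`, `findDuplicate`, and the record fields are hypothetical simplifications.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical, simplified stand-in for the MediaFile document.
interface MediaRecord {
  id: string;
  hash: string;
  maletId: string;
  s3Key: string;
  status: 'PENDING' | 'COMPLETE';
}

function sha256Hex(data: Uint8Array | string): string {
  return createHash('sha256').update(data).digest('hex');
}

// A duplicate must share the hash, belong to the same Malet, and be fully processed.
function findDuplicate(
  records: MediaRecord[],
  data: Uint8Array | string,
  maletId: string
): MediaRecord | undefined {
  const hash = sha256Hex(data);
  return records.find(
    (r) => r.hash === hash && r.maletId === maletId && r.status === 'COMPLETE'
  );
}
```

Scoping the match to one Malet means two Malets uploading the same file each get their own stored copy.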
Reference-Counted Deletion
When deleting a file, the system checks how many MediaFile documents reference the same S3 key:
- Last reference (refCount <= 1) → Delete from S3 + all thumbnails
- Other references remain → Delete the MongoDB document only, S3 objects stay
This prevents orphaned S3 objects while ensuring deduplication doesn't cause data loss.
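The rule can be expressed as a pure function over the set of records. This is a minimal sketch with hypothetical names (`StoredFile`, `DeletionPlan`, `planDeletion`); the real service would run the count as a database query.

```typescript
// Hypothetical, simplified record: only the fields the deletion rule needs.
interface StoredFile {
  id: string;
  s3Key: string;
}

interface DeletionPlan {
  remaining: StoredFile[];        // documents left after the delete
  deleteObjectKey: string | null; // S3 key to remove, or null to keep the object
}

function planDeletion(records: StoredFile[], id: string): DeletionPlan {
  const target = records.find((r) => r.id === id);
  if (!target) throw new Error(`Unknown file id: ${id}`);
  // Count every document (including the target) that points at the same key.
  const refCount = records.filter((r) => r.s3Key === target.s3Key).length;
  return {
    remaining: records.filter((r) => r.id !== id),
    deleteObjectKey: refCount <= 1 ? target.s3Key : null
  };
}
```

In a concurrent setting the count and the delete would need to happen atomically; this sketch ignores that.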
Storage Abstraction
The media service uses a provider pattern for storage abstraction, supporting both platform defaults and custom Malet storage.
Provider Interface
interface StorageProvider {
  upload(file: Express.Multer.File, key: string): Promise<string>;
  delete(key: string): Promise<void>;
  exists(key: string): Promise<boolean>;
  download(key: string): Promise<Buffer>;
}
Provider Resolution
upload(file, maletId?)
├── No maletId → System S3 provider (default bucket)
└── maletId provided
    ├── Malet has custom storage config → DynamicStorageFactory
    └── No custom config → System S3 provider (fallback)
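The resolution tree above reduces to two checks. The sketch below is an illustration under assumed types: `ProviderRef`, `MaletStorageConfig`, and `resolveProvider` are hypothetical, and a plain `Map` stands in for whatever lookup the config service performs.

```typescript
// Hypothetical minimal types; the real StorageProvider has more to it.
interface ProviderRef {
  kind: 'system' | 'custom';
  bucket: string;
}

interface MaletStorageConfig {
  bucket: string;
}

const SYSTEM_PROVIDER: ProviderRef = { kind: 'system', bucket: 'ngwenya-media' };

function resolveProvider(
  configs: Map<string, MaletStorageConfig>,
  maletId?: string
): ProviderRef {
  // No maletId: always the platform default.
  if (!maletId) return SYSTEM_PROVIDER;
  const custom = configs.get(maletId);
  // Custom config present: Malet-specific provider; otherwise fall back.
  return custom ? { kind: 'custom', bucket: custom.bucket } : SYSTEM_PROVIDER;
}
```

Falling back to the system provider means a Malet without custom storage never needs any configuration to upload.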
R2 CORS Initialization
On service startup, the media service automatically configures CORS on the S3/R2 bucket:
import { PutBucketCorsCommand } from '@aws-sdk/client-s3';

// Runs on NestJS module init
await s3.send(
  new PutBucketCorsCommand({
    Bucket: 'ngwenya-media',
    CORSConfiguration: {
      CORSRules: [
        {
          AllowedOrigins: [process.env.FRONTEND_URL],
          AllowedMethods: ['GET', 'PUT'],
          AllowedHeaders: ['*'],
          MaxAgeSeconds: 3600
        }
      ]
    }
  })
);
Because the rule is reapplied on every startup, presigned URL uploads from the browser keep working after bucket recreation or environment changes, as long as the browser's origin matches FRONTEND_URL.
CDN URL Rewriting
The CdnService transparently rewrites storage URLs (MinIO, S3, R2) to CDN URLs (Cloudflare) when enabled.
| Environment | URL Pattern |
|---|---|
| Development | http://localhost:9000/ngwenya-media/uploads/abc.webp |
| Production | https://cdn.mallnline.com/ngwenya-media/uploads/abc.webp |
Configuration
CDN_ENABLED=true
CDN_BASE_URL=https://cdn.mallnline.com
CDN_ORIGIN_PATTERNS=localhost:9000,s3.us-east-1.amazonaws.com
When CDN_ENABLED=false (default in development), all URLs pass through unchanged.
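A rewrite rule consistent with the table and configuration above could be sketched as follows. `rewriteUrl` and `CdnConfig` are hypothetical; the real CdnService may match origins differently (this version does a substring match on the host).

```typescript
interface CdnConfig {
  enabled: boolean;
  baseUrl: string;          // e.g. CDN_BASE_URL
  originPatterns: string[]; // e.g. CDN_ORIGIN_PATTERNS split on commas
}

// Hypothetical sketch of the rewrite rule; the real CdnService may differ.
function rewriteUrl(url: string, cdn: CdnConfig): string {
  if (!cdn.enabled) return url;
  const parsed = new URL(url);
  // Match on host (including port) so only known storage origins are rewritten.
  const isStorageOrigin = cdn.originPatterns.some((p) => parsed.host.includes(p));
  if (!isStorageOrigin) return url;
  const base = cdn.baseUrl.replace(/\/+$/, '');
  return `${base}${parsed.pathname}${parsed.search}`;
}

const cdn: CdnConfig = {
  enabled: true,
  baseUrl: 'https://cdn.mallnline.com',
  // Drop empty entries so a trailing comma cannot match every host.
  originPatterns: 'localhost:9000,s3.us-east-1.amazonaws.com'
    .split(',')
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
};

const rewritten = rewriteUrl('http://localhost:9000/ngwenya-media/uploads/abc.webp', cdn);
// → "https://cdn.mallnline.com/ngwenya-media/uploads/abc.webp"
```

URLs whose host matches none of the patterns are returned unchanged, so external links stored alongside media are not touched.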
Video Transcoding
The apps/video service is a dedicated microservice (TCP:30021) that handles FFmpeg-based video processing. It mirrors the apps/imaging architecture and is intentionally decoupled from apps/media to keep the media service lightweight.
HLS Transcoding
Source video (MP4/MOV) → FFmpeg → HLS segments (.ts) + playlist (.m3u8) + poster (.jpg)
| Output | Format | Details |
|---|---|---|
| Playlist | .m3u8 | VOD playlist, 6-second segments |
| Segments | .ts | H.264 video + AAC audio, CRF 23, fast preset |
| Poster | .jpg | 1280×720 screenshot at 1 second |
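Expressed as raw FFmpeg arguments, the settings in the table correspond roughly to the lists built below. The service itself goes through fluent-ffmpeg, so this is an equivalent sketch rather than its actual code; `buildHlsArgs`, `buildPosterArgs`, and the output file names are hypothetical.

```typescript
// Hypothetical helper: argument list for a single-rendition HLS transcode.
function buildHlsArgs(input: string, outDir: string): string[] {
  return [
    '-i', input,
    '-c:v', 'libx264',            // H.264 video
    '-preset', 'fast',
    '-crf', '23',
    '-c:a', 'aac',                // AAC audio
    '-hls_time', '6',             // target segment length in seconds
    '-hls_playlist_type', 'vod',
    '-hls_segment_filename', `${outDir}/segment_%03d.ts`,
    '-f', 'hls',
    `${outDir}/playlist.m3u8`
  ];
}

// Hypothetical helper: grab one 1280x720 frame at the 1-second mark.
function buildPosterArgs(input: string, outDir: string): string[] {
  return [
    '-ss', '1',
    '-i', input,
    '-frames:v', '1',
    '-s', '1280x720',
    `${outDir}/poster.jpg`
  ];
}
```

Either list can be passed to a child process that runs the `ffmpeg` binary, for example via `spawn('ffmpeg', buildHlsArgs(input, outDir))` from `node:child_process`.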
TCP Microservice Contract
// From apps/media: fire and forget
await this.videoTranscodeService.initiate(fileId, s3Key, maletId);
// apps/video receives via TCP MessagePattern
@MessagePattern('video.transcode')
async handleTranscode(payload: TranscodePayload): Promise<TranscodeResult>
Future: This NestJS service is a reference implementation. A Rust rewrite is planned for production scale, using ffmpeg-sys bindings for lower memory overhead. Keeping the TCP contract stable is what allows the Rust binary to be a drop-in replacement.
Environment Variables
| Variable | Default | Description |
|---|---|---|
| S3_ENDPOINT | http://localhost:9000 | S3-compatible storage endpoint |
| S3_REGION | us-east-1 | Storage region |
| S3_BUCKET | ngwenya-media | Default storage bucket |
| S3_ACCESS_KEY | (none) | Required. Storage access key |
| S3_SECRET_KEY | (none) | Required. Storage secret key |
| S3_PUBLIC_URL | http://localhost:9000 | Public URL prefix for stored objects |
| CDN_ENABLED | false | Toggle CDN URL rewriting |
| CDN_BASE_URL | (none) | CDN base URL for rewriting |
| CDN_ORIGIN_PATTERNS | localhost:9000 | Comma-separated storage origins to rewrite |
| FRONTEND_URL | (none) | Used for automated CORS configuration |
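One way to read these variables with their defaults and fail fast on the two required keys is sketched below. `loadMediaConfig`, `MediaConfig`, and `required` are hypothetical names; the service's actual configuration loading is not shown in this guide.

```typescript
interface MediaConfig {
  s3Endpoint: string;
  s3Region: string;
  s3Bucket: string;
  s3AccessKey: string;
  s3SecretKey: string;
  s3PublicUrl: string;
  cdnEnabled: boolean;
  cdnBaseUrl: string | undefined;
  cdnOriginPatterns: string[];
  frontendUrl: string | undefined;
}

type Env = Record<string, string | undefined>;

// Throws when a required variable is missing or empty.
function required(env: Env, name: string): string {
  const value = env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

function loadMediaConfig(env: Env): MediaConfig {
  return {
    s3Endpoint: env.S3_ENDPOINT ?? 'http://localhost:9000',
    s3Region: env.S3_REGION ?? 'us-east-1',
    s3Bucket: env.S3_BUCKET ?? 'ngwenya-media',
    s3AccessKey: required(env, 'S3_ACCESS_KEY'),
    s3SecretKey: required(env, 'S3_SECRET_KEY'),
    s3PublicUrl: env.S3_PUBLIC_URL ?? 'http://localhost:9000',
    cdnEnabled: env.CDN_ENABLED === 'true',
    cdnBaseUrl: env.CDN_BASE_URL,
    cdnOriginPatterns: (env.CDN_ORIGIN_PATTERNS ?? 'localhost:9000')
      .split(',')
      .map((p) => p.trim())
      .filter((p) => p.length > 0),
    frontendUrl: env.FRONTEND_URL
  };
}
```

In a Node process this would typically be called once at startup as `loadMediaConfig(process.env)`.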
Module Structure
apps/media/src/
├── media/
│   ├── media.service.ts              # Upload, dedup, confirm flow
│   ├── media.resolver.ts             # GraphQL mutations/queries
│   ├── media.controller.ts           # REST upload endpoint
│   ├── image-processor.service.ts    # sharp: WebP, thumbnails, blur, EXIF
│   ├── malet-config.service.ts       # Custom storage config resolution
│   ├── upload-cleanup.service.ts     # Orphan cleanup
│   ├── models/media-file.model.ts    # MediaFile entity
│   └── providers/
│       ├── storage-provider.interface.ts
│       ├── s3-storage.provider.ts
│       └── dynamic-storage.factory.ts
├── cdn/
│   └── cdn.service.ts                # URL rewriting
├── image-enhancement/                # AI background removal
├── gallery/                          # Gallery management
└── video-transcode/                  # TCP client to apps/video
apps/video/src/
└── video/
    ├── video.service.ts              # FFmpeg HLS transcoding
    └── video.controller.ts           # TCP message handler
Testing
Unit Tests
# Media service (includes image processor, CDN, dedup, gallery)
npm run test -- apps/media --no-coverage
# Video service
npm run test -- apps/video --no-coverage
E2E Tests
npx jest --config apps/media/test/jest-e2e.json --forceExit
Related
- Media Intelligence & Processing: EXIF extraction, SHA-256 deduplication, and video transcoding
- Media Analytics API: Upload volume, storage usage, and type distribution analytics
- Soft-Delete & Trash System: Data lifecycle management for catalog items that reference media