Database Backup & Disaster Recovery
The Ngwenya platform stores data across two database engines: MongoDB for the federated subgraph services (Malets, products, Murchases, blogs, etc.) and Postgres for the Rust services (auth, uChat, SCIM). This document covers the backup infrastructure, storage backends, scheduling, and restore procedures.
Architecture Overview
graph TB
subgraph Databases
MONGO[(MongoDB<br/>ngwenya)]
PG_AUTH[(Postgres<br/>ngwenya_auth)]
PG_UCHAT[(Postgres<br/>ngwenya_uchat)]
PG_SCIM[(Postgres<br/>ngwenya_scim)]
end
subgraph "Backup Scripts"
BACKUP[backup-databases.sh]
RESTORE[restore-databases.sh]
end
subgraph "Storage Backends"
LOCAL[Local Filesystem<br/>./backups/]
S3[AWS S3]
R2[Cloudflare R2]
end
subgraph Scheduling
LAUNCHD[macOS launchd<br/>Daily 2 AM local]
CRON[GitHub Actions<br/>Daily 3 AM UTC]
MANUAL[make backup-db<br/>On-demand]
end
MONGO --> BACKUP
PG_AUTH --> BACKUP
PG_UCHAT --> BACKUP
PG_SCIM --> BACKUP
BACKUP --> LOCAL
BACKUP --> S3
BACKUP --> R2
LAUNCHD --> BACKUP
CRON --> BACKUP
MANUAL --> BACKUP
LOCAL --> RESTORE
S3 --> RESTORE
R2 --> RESTORE
What Gets Backed Up
| Database | Engine | Contents | Dump Tool |
|---|---|---|---|
| ngwenya | MongoDB | All collections: Malets, products, services, Murchases, blogs, communities, collections, media metadata, search indexes, etc. | mongodump --gzip |
| ngwenya_auth | Postgres | Users, sessions, OAuth tokens, passkeys, MFA secrets, device fingerprints | pg_dump -Fc |
| ngwenya_uchat | Postgres | E2EE messages, conversations, participants, presence state | pg_dump -Fc |
| ngwenya_scim | Postgres | SCIM provisioning tokens, IdP configurations, SAML metadata | pg_dump -Fc |
Not backed up: Redis (ephemeral session/cache data), Meilisearch (rebuildable from MongoDB via make search-reindex), and MinIO/S3 media files (separate CDN backup strategy).
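As a rough illustration of how the dump tools in the table map to databases, the sketch below prints (rather than runs) one dump command per database. The function name dump_command and the output layout are assumptions for illustration; the real script may invoke the tools inside containers or with additional connection flags.

```shell
# Print the dump command for one database, following the table above.
# dump_command and the out_dir layout are illustrative, not the script's real API.
dump_command() {
  local db="$1" out_dir="$2"
  case "$db" in
    ngwenya)
      # MongoDB: gzip-compressed single-file archive
      echo "mongodump --db=$db --gzip --archive=$out_dir/mongo/$db.archive.gz" ;;
    ngwenya_auth|ngwenya_uchat|ngwenya_scim)
      # Postgres: custom-format dump (-Fc), readable by pg_restore
      echo "pg_dump -Fc -f $out_dir/postgres/$db.dump $db" ;;
    *)
      echo "unknown database: $db" >&2
      return 1 ;;
  esac
}

dump_command ngwenya backups/2026-05-07T15-36
dump_command ngwenya_auth backups/2026-05-07T15-36
```

Printing the command first makes it easy to check paths and flags before anything touches a database.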
Makefile Targets
Manual Backup & Restore
make backup-db # Full backup (MongoDB + all Postgres)
make backup-db-mongo # MongoDB only
make backup-db-postgres # Postgres only
make restore-db # Restore from latest backup
make restore-db BACKUP=2026-05-07T15-20 # Restore specific timestamp
make restore-db-mongo # Restore MongoDB only
make restore-db-postgres # Restore Postgres only
Scheduling (macOS)
make backup-schedule-install # Install daily 2 AM launchd job
make backup-schedule-uninstall # Remove launchd job
make backup-schedule-status # Check scheduler status + last log
Storage Backends
Backups support pluggable storage, selected via the BACKUP_STORAGE environment variable. Each backend is implemented as a single upload function in scripts/backup-databases.sh.
Local (Default)
| Variable | Default | Description |
|---|---|---|
| BACKUP_STORAGE | local | Storage backend selector |
| BACKUP_LOCAL_DIR | ./backups | Output directory |
| BACKUP_RETENTION_DAYS | 7 | Auto-prune older backups |
make backup-db
# → backups/2026-05-07T15-36/mongo/ngwenya.archive.gz
# → backups/2026-05-07T15-36/postgres/ngwenya_auth.dump
# → backups/2026-05-07T15-36/postgres/ngwenya_uchat.dump
AWS S3
| Variable | Required | Description |
|---|---|---|
| BACKUP_STORAGE | Yes | Set to s3 |
| BACKUP_S3_BUCKET | Yes | S3 bucket name |
| BACKUP_S3_REGION | No | Default: auto |
| AWS_ACCESS_KEY_ID | Yes | IAM credentials |
| AWS_SECRET_ACCESS_KEY | Yes | IAM credentials |
BACKUP_STORAGE=s3 BACKUP_S3_BUCKET=ngwenya-backups make backup-db
Cloudflare R2
| Variable | Required | Description |
|---|---|---|
| BACKUP_STORAGE | Yes | Set to r2 |
| BACKUP_S3_BUCKET | Yes | R2 bucket name |
| BACKUP_S3_ENDPOINT | Yes | R2 S3-compatible endpoint |
| AWS_ACCESS_KEY_ID | Yes | R2 API token (Access Key ID) |
| AWS_SECRET_ACCESS_KEY | Yes | R2 API token (Secret) |
BACKUP_STORAGE=r2 \
BACKUP_S3_BUCKET=ngwenya-backups \
BACKUP_S3_ENDPOINT=https://xxxx.r2.cloudflarestorage.com \
make backup-db
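The s3 and r2 backends differ mainly in whether a custom endpoint is supplied. The sketch below shows one way the upload command could be assembled from the variables above. It assumes the AWS CLI is the upload tool and that backups are stored under a per-timestamp key prefix; neither is confirmed here, and s3_upload_command is a hypothetical helper name. It prints the command instead of running it.

```shell
# Build the upload command for the s3 and r2 backends (printed, not executed).
# Assumption: the AWS CLI is used; the real script may use a different tool.
s3_upload_command() {
  local src_dir="$1" timestamp="$2" cmd
  if [ -z "${BACKUP_S3_BUCKET:-}" ]; then
    echo "BACKUP_S3_BUCKET is required" >&2
    return 1
  fi
  cmd="aws s3 cp $src_dir s3://$BACKUP_S3_BUCKET/$timestamp --recursive"
  case "${BACKUP_STORAGE:-local}" in
    s3)
      # Pass the region through only when one is set explicitly
      if [ -n "${BACKUP_S3_REGION:-}" ]; then
        cmd="$cmd --region $BACKUP_S3_REGION"
      fi ;;
    r2)
      # R2 is reached through its S3-compatible endpoint
      if [ -z "${BACKUP_S3_ENDPOINT:-}" ]; then
        echo "BACKUP_S3_ENDPOINT is required for r2" >&2
        return 1
      fi
      cmd="$cmd --endpoint-url $BACKUP_S3_ENDPOINT" ;;
    *)
      echo "not an S3-compatible backend: ${BACKUP_STORAGE:-local}" >&2
      return 1 ;;
  esac
  echo "$cmd"
}

BACKUP_STORAGE=r2 BACKUP_S3_BUCKET=ngwenya-backups \
BACKUP_S3_ENDPOINT=https://xxxx.r2.cloudflarestorage.com \
  s3_upload_command backups/2026-05-07T15-36 2026-05-07T15-36
```

Credentials are not passed on the command line: the AWS CLI reads AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the environment.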
Adding a New Backend
To add a new storage backend (e.g., GCS, Azure Blob):
- Add an upload_gcs() function in scripts/backup-databases.sh
- Add a gcs) case to the upload() switch
- Document the required env vars in docs/environment-variables.md
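The steps above could look roughly like the following. Only upload() and upload_gcs() are named in this document; the other function names, their arguments, and the echo bodies are placeholders, since the real functions would copy files rather than print.

```shell
# Placeholder backend functions; each would copy $1 (a backup directory) to its target.
upload_local() { echo "local: kept $1 in ${BACKUP_LOCAL_DIR:-./backups}"; }
upload_s3()    { echo "s3: would upload $1"; }
upload_r2()    { echo "r2: would upload $1"; }
upload_gcs()   { echo "gcs: would upload $1"; }   # step 1: the new backend function

# Dispatch on BACKUP_STORAGE, defaulting to the local backend.
upload() {
  case "${BACKUP_STORAGE:-local}" in
    local) upload_local "$1" ;;
    s3)    upload_s3 "$1" ;;
    r2)    upload_r2 "$1" ;;
    gcs)   upload_gcs "$1" ;;                     # step 2: the new case
    *)
      echo "unsupported BACKUP_STORAGE: $BACKUP_STORAGE" >&2
      return 1 ;;
  esac
}

BACKUP_STORAGE=gcs upload backups/2026-05-07T15-36
```

Rejecting unknown values in the default branch means a typo in BACKUP_STORAGE fails loudly instead of silently keeping the backup local.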
Scheduling
Local Development (macOS launchd)
make backup-schedule-install
This creates a launchd plist at ~/Library/LaunchAgents/com.mallnline.ngwenya.backup.plist that runs backup-databases.sh daily at 2:00 AM local time.
| Setting | Value |
|---|---|
| Schedule | Daily at 2:00 AM local time |
| Logs | backups/logs/backup.log |
| Container engine | Uses CONTAINER_CMD (default: podman) |
| Requires | Database containers must be running |
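For orientation, a launchd job with this schedule could be rendered as in the sketch below. The Label is inferred from the plist filename, and render_backup_plist, the /bin/bash invocation, and the path arguments are assumptions; the plist that make backup-schedule-install actually writes may differ.

```shell
# Render a launchd plist that runs the backup script daily at 02:00 local time.
# Assumption: the Label mirrors the plist filename; paths are supplied by the caller.
render_backup_plist() {
  local script_path="$1" log_path="$2"
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.mallnline.ngwenya.backup</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/bash</string>
    <string>${script_path}</string>
  </array>
  <key>StartCalendarInterval</key>
  <dict>
    <key>Hour</key>
    <integer>2</integer>
    <key>Minute</key>
    <integer>0</integer>
  </dict>
  <key>StandardOutPath</key>
  <string>${log_path}</string>
  <key>StandardErrorPath</key>
  <string>${log_path}</string>
</dict>
</plist>
EOF
}

render_backup_plist "$PWD/scripts/backup-databases.sh" "$PWD/backups/logs/backup.log"
```

StartCalendarInterval uses the machine's local time zone, which is why the schedule is stated in local time rather than UTC.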
To check if it's working:
make backup-schedule-status
To remove:
make backup-schedule-uninstall
CI/CD (GitHub Actions)
The backup.yaml workflow runs automated backups as part of the CI/CD pipeline:
| Setting | Value |
|---|---|
| Schedule | Daily at 3:00 AM UTC (0 3 * * *) |
| Manual trigger | workflow_dispatch from the Actions tab |
| Cloud storage | Via repo secrets: BACKUP_S3_BUCKET, BACKUP_AWS_ACCESS_KEY_ID |
| Local fallback | GitHub Actions artifacts (30-day retention) |
To change the cron schedule, edit the cron expression in .github/workflows/backup.yaml.
Restore Procedures
From Latest Local Backup
make restore-db
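Because backup directories are named by timestamp, "latest" can be resolved by name alone. The sketch below shows one possible approach; latest_backup is a hypothetical helper, and the restore script may select the latest backup differently.

```shell
# Print the newest backup directory name under a backup root.
# Names like 2026-05-07T15-20 sort chronologically as plain strings.
latest_backup() {
  local root="${1:-${BACKUP_LOCAL_DIR:-./backups}}" latest="" dir name
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue                 # unmatched glob or not a directory
    name="$(basename "$dir")"
    case "$name" in
      [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T*) ;;   # timestamped backup
      *) continue ;;                                      # skip e.g. logs/
    esac
    latest="$name"   # globs expand in sorted order, so the last match is newest
  done
  [ -n "$latest" ] && echo "$latest"
}

latest_backup ./backups || echo "no backups found"
```

The function returns a non-zero status when no backup exists, so a caller can stop before attempting a restore.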
From Specific Backup
make restore-db BACKUP=2026-05-07T15-20
From Cloud Storage
Set the same BACKUP_STORAGE / BACKUP_S3_* env vars, then specify the backup timestamp:
BACKUP_STORAGE=s3 BACKUP_S3_BUCKET=ngwenya-backups \
make restore-db BACKUP=2026-05-07T15-20
The restore script downloads the backup from S3/R2 to a temporary directory, restores from it, and then cleans up.
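The download, restore, and cleanup flow can be sketched as below. The helpers download_backup and restore_backup are hypothetical stand-ins that only print what they would do; the point of the sketch is that the temporary directory is removed whether or not the restore succeeds.

```shell
# Stand-ins for the real download and restore steps (hypothetical names).
download_backup() { mkdir -p "$2/$1" && echo "downloaded $1"; }
restore_backup()  { echo "restored from $(basename "$1")"; }

# Download a backup into a temp directory, restore from it, then clean up.
restore_from_cloud() {
  local timestamp="$1" tmp_dir status=0
  tmp_dir="$(mktemp -d)" || return 1
  { download_backup "$timestamp" "$tmp_dir" &&
    restore_backup "$tmp_dir/$timestamp"; } || status=$?
  rm -rf "$tmp_dir"   # clean up whether or not the restore succeeded
  return "$status"
}

restore_from_cloud 2026-05-07T15-20
```

Capturing the exit status before cleanup lets the caller still see a failed download or restore.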
What Restore Does
| Database | Restore Tool | Behavior |
|---|---|---|
| MongoDB | mongorestore --drop | Drops existing collections first, then restores |
| Postgres | pg_restore --clean --if-exists | Drops existing objects before recreating them; objects that do not exist yet are skipped without error |
WARNING
Restore is destructive: it replaces current database contents. Always verify you're restoring to the correct environment.
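To make the destructive flags concrete, the sketch below prints the restore command for a given database without running it. The function name restore_command and the file layout are assumptions (the layout mirrors the local backup output shown earlier); connection options are omitted.

```shell
# Print the restore command for one database (dry run; nothing is executed).
# restore_command is an illustrative helper, not part of the real script.
restore_command() {
  local db="$1" backup_dir="$2"
  case "$db" in
    ngwenya)
      # --drop removes each collection before restoring it from the archive
      echo "mongorestore --drop --gzip --archive=$backup_dir/mongo/$db.archive.gz" ;;
    ngwenya_auth|ngwenya_uchat|ngwenya_scim)
      # --clean drops objects before recreating them; --if-exists avoids
      # errors for objects that are not present yet
      echo "pg_restore --clean --if-exists -d $db $backup_dir/postgres/$db.dump" ;;
    *)
      echo "unknown database: $db" >&2
      return 1 ;;
  esac
}

restore_command ngwenya_auth backups/2026-05-07T15-20
```

Reviewing the printed command, including the target database name, is a cheap way to confirm the environment before a real restore.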
Integrity & Safety
The backup script includes built-in integrity checks:
- Non-empty verification: After backup, all dump files are checked for non-zero size. If any file is empty, the script exits with an error before pruning old backups.
- Retention guard: Old backups are only pruned after the new backup is verified.
- Database existence check: Postgres databases that don't exist are skipped gracefully (e.g., ngwenya_scim may not exist in all environments).
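The ordering of the first two checks, verify first and prune only afterwards, can be sketched as follows. The function names are hypothetical, and pruning by directory modification time is an assumption; the real script may decide age from the timestamp in the directory name instead.

```shell
# Fail if any dump file in the new backup is empty, or if there are none at all.
verify_backup() {
  local backup_dir="$1" file found=0
  for file in "$backup_dir"/mongo/*.archive.gz "$backup_dir"/postgres/*.dump; do
    [ -e "$file" ] || continue        # unmatched glob
    found=1
    if [ ! -s "$file" ]; then
      echo "empty dump file: $file" >&2
      return 1
    fi
  done
  [ "$found" -eq 1 ]
}

# Remove timestamped backup directories older than BACKUP_RETENTION_DAYS.
# Assumption: age is judged by directory mtime (whole days, as find counts them).
prune_old_backups() {
  local root="$1" days="${BACKUP_RETENTION_DAYS:-7}"
  find "$root" -mindepth 1 -maxdepth 1 -type d -name '20*T*' \
    -mtime +"$days" -exec rm -rf {} +
}

# Retention guard: old backups survive unless the new one verifies.
finish_backup() {
  local root="$1" timestamp="$2"
  verify_backup "$root/$timestamp" || return 1
  prune_old_backups "$root"
}
```

With this ordering, a failed or truncated dump never causes the last good backup to be deleted.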
Related
- CI/CD Pipeline & Production Infrastructure: GitHub Actions pipeline architecture and backup scheduling
- Local Development Environment: Podman setup, service containers, and database access
- Seed Infrastructure: Populating databases with test data (22 archetypes)
- Staging Environment Architecture: Pre-production environment for backup restore testing
- Environment Variables: Full backup env var reference