Architecture
The system is a monorepo (frontend, backend, packages/shared) that turns one long video into many vertical shorts. The API is a control plane only: it authenticates, issues presigned MinIO URLs, enqueues jobs, and serves read models. All heavy work runs in BullMQ workers backed by Redis, with PostgreSQL as the source of truth for projects, jobs, transcripts, clips, and an append-only job_events log.
Blobs never land in Postgres; asset records store only a bucket and objectKey. Workers download to temp disk, call FFmpeg / Whisper / OpenRouter, upload results, then append events that the UI can poll.
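The "append events the UI can poll" step can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the `JobEvent` field names, the `JobEventLog` class, and the cursor-based `since` query are hypothetical and stand in for the real job_events table and polling endpoint.

```typescript
// Hypothetical shape of one job_events row; the field names are assumptions.
interface JobEvent {
  seq: number; // monotonically increasing, assigned on append
  jobId: string;
  stage: string; // e.g. "media.probe"
  message: string;
}

// In-memory stand-in for the append-only job_events table.
class JobEventLog {
  private readonly events: JobEvent[] = [];

  // Append only: there is deliberately no update or delete method.
  append(jobId: string, stage: string, message: string): JobEvent {
    const event: JobEvent = { seq: this.events.length + 1, jobId, stage, message };
    this.events.push(event);
    return event;
  }

  // What a polling endpoint might return: events for one job after a cursor.
  since(jobId: string, afterSeq: number): JobEvent[] {
    return this.events.filter((e) => e.jobId === jobId && e.seq > afterSeq);
  }
}

const log = new JobEventLog();
log.append("job-1", "media.probe", "started");
log.append("job-2", "media.probe", "started");
log.append("job-1", "media.probe", "succeeded");

// The first poll sees every event for job-1; the client keeps the last seq.
const firstPoll = log.since("job-1", 0);
const cursor = firstPoll[firstPoll.length - 1].seq;

// A second poll with that cursor returns nothing new.
const secondPoll = log.since("job-1", cursor);
```

A cursor keeps each poll cheap because the client only asks for rows it has not seen yet. Whether the real UI polls this way is not stated in this document.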
Deploy invariant: the HTTP request lifecycle must never run probe, transcribe, analyze, or render work. Docker Compose runs one API container and six worker containers so that CPU-heavy FFmpeg and GPU/CPU Whisper do not starve the API.
Engineering details
- Dual-layer jobs: Every enqueue creates a `processing_jobs` row (`idempotencyKey` unique), then `queue.add` with `jobId = row.id`. Workers no-op if status is already `succeeded`. Retries get a new key suffix (`:timestamp` or `reprocess:…`).
- Queue chaining: Happy path: `media.probe` → `media.audio_extract` → `transcription.whisper` → `transcript.normalize` → `ai.analyze_transcript` → optional N × `render.clip` → `export.zip`.
- AI pipeline: Per-chunk scout (Mastra + OpenRouter) → deterministic dedupe → global ranker → render planner → optional JSON repair → `BoundaryRefiner` on word/silence timings → `clip_candidates` + render plans.
- Storage split: MinIO internal client for workers; separate public endpoint for browser presigned PUT/GET (`MINIO_PUBLIC_ENDPOINT`).
- Project FSM: `projects.status` transitions are explicitly whitelisted; illegal transitions return 400 before workers run.
- Observability: `job_events` per stage + `ai_runs` (input hash, prompt version, tokens, validation errors). Admin jobs UI lists queue depth via shared `QUEUE_NAMES`.
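
The dual-layer jobs rule can be sketched with in-memory stand-ins. This is a sketch under stated assumptions, not the project's code: `JobTable`, `FakeQueue`, `enqueue`, `handle`, and the `project-42:media.probe` key format are all hypothetical. The fake queue ignores a second add with the same job id, which is meant to model how a queue keyed by `jobId` is expected to behave.

```typescript
type JobStatus = "queued" | "running" | "succeeded" | "failed";

interface ProcessingJobRow {
  id: string;
  idempotencyKey: string;
  status: JobStatus;
}

// In-memory stand-in for processing_jobs with a unique idempotencyKey.
class JobTable {
  private readonly byKey = new Map<string, ProcessingJobRow>();
  private nextId = 1;

  // Returns the existing row when the key was already used instead of inserting twice.
  insertOrGet(idempotencyKey: string): ProcessingJobRow {
    const existing = this.byKey.get(idempotencyKey);
    if (existing) return existing;
    const row: ProcessingJobRow = { id: `job-${this.nextId++}`, idempotencyKey, status: "queued" };
    this.byKey.set(idempotencyKey, row);
    return row;
  }
}

// In-memory stand-in for a queue that ignores a repeated add with the same job id.
class FakeQueue {
  readonly jobIds: string[] = [];
  add(jobId: string): void {
    if (this.jobIds.indexOf(jobId) === -1) this.jobIds.push(jobId);
  }
}

// Layer 1: the database row. Layer 2: the queue job whose id mirrors the row id.
function enqueue(table: JobTable, queue: FakeQueue, idempotencyKey: string): ProcessingJobRow {
  const row = table.insertOrGet(idempotencyKey);
  queue.add(row.id);
  return row;
}

// Worker side: skip work that an earlier delivery already finished.
function handle(row: ProcessingJobRow, work: () => void): "skipped" | "done" {
  if (row.status === "succeeded") return "skipped";
  work();
  row.status = "succeeded";
  return "done";
}

const table = new JobTable();
const queue = new FakeQueue();

const first = enqueue(table, queue, "project-42:media.probe");
const duplicate = enqueue(table, queue, "project-42:media.probe");

let runs = 0;
const firstResult = handle(first, () => { runs += 1; });
const redelivered = handle(duplicate, () => { runs += 1; });

// A retry uses a fresh key suffix, so it gets its own row and its own queue job.
const retry = enqueue(table, queue, "project-42:media.probe:1700000000000");
```

In this sketch a duplicate enqueue resolves to the same row and queue job, a redelivered job is skipped once the row is `succeeded`, and only a retry with a new key suffix produces fresh work.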
Subsystems
| Layer | Responsibility |
|---|---|
| Next.js frontend | Auth, upload to MinIO, status polling, transcript/clips review, export download |
| NestJS API | RBAC, upload sessions, enqueue, cancel/reprocess, health (DB + Redis + MinIO) |
| Workers | FFmpeg probe/extract/render, faster-whisper, LLM workflow, ZIP export, cleanup sweeps |
| Postgres | Metadata, transcripts, processing_jobs, job_events, ai_runs, clip_candidates |
| Redis | BullMQ transport only (not pub/sub fan-out) |
| MinIO | Raw video, audio, transcripts, rendered MP4s, exports |
Failure modes
| Risk | Mitigation / gap |
|---|---|
| Workers stopped | Jobs stay queued in Postgres; no in-app alert |
| BullMQ enqueue happens after the DB insert | A failed enqueue leaves a row without bullmqJobId; no reconciler in code yet |
| Long video analyze | Roughly one LLM scout call per ~7 min chunk, so cost grows linearly with video length |
| Render retry | Retries create a new render via the clips API rather than a generic job replay |
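
The linear cost of the analyze stage can be estimated with a little arithmetic. The sketch below assumes non-overlapping chunks of exactly 7 minutes and one scout call per chunk. The real chunker is only described as "~7 min", so treat the constant and the `estimateScoutCalls` helper as hypothetical.

```typescript
// Assumed chunk length: the text says "~7 min", so 420 s is an approximation.
const CHUNK_SECONDS = 7 * 60;

// Estimated scout calls for a video, assuming one call per non-overlapping chunk.
function estimateScoutCalls(durationSeconds: number, chunkSeconds: number = CHUNK_SECONDS): number {
  if (durationSeconds <= 0) return 0;
  return Math.ceil(durationSeconds / chunkSeconds);
}

const thirtyMinutes = estimateScoutCalls(30 * 60); // 1800 / 420 ≈ 4.29, rounds up to 5
const twoHours = estimateScoutCalls(2 * 60 * 60); // 7200 / 420 ≈ 17.14, rounds up to 18
```

Under these assumptions a two-hour video needs about 18 scout calls, compared with 5 for a thirty-minute one. Later stages such as the global ranker are not counted here.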