Sergey Orsik.dev
2026-05-16

Worker Process Isolation by Domain

Six separate Node worker entrypoints consume disjoint BullMQ queues so CPU-heavy FFmpeg, GPU transcription, and LLM calls do not starve each other.

Topology (inferred from worker-bootstrap.ts + Compose)

Process              | Queues                                      | Typical bottleneck
media.worker         | media.probe, media.audio_extract            | FFmpeg, disk I/O
transcription.worker | transcription.whisper                       | Python faster-whisper subprocess
ai.worker            | transcript.normalize, ai.analyze_transcript | OpenRouter latency, token cost
render.worker        | render.clip                                 | FFmpeg filter graphs, CPU
export.worker        | export.zip                                  | Zip + large downloads
maintenance.worker   | maintenance.cleanup                         | Scheduled sweeps
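
The topology can also be restated as a typed map, which makes the "disjoint queues" property checkable. This is a minimal sketch: WORKER_QUEUES and findSharedQueues are hypothetical names, and the real mapping lives in worker-bootstrap.ts, which is not reproduced here.

```typescript
// Hypothetical restatement of the topology table; not taken from worker-bootstrap.ts.
const WORKER_QUEUES = {
  "media.worker": ["media.probe", "media.audio_extract"],
  "transcription.worker": ["transcription.whisper"],
  "ai.worker": ["transcript.normalize", "ai.analyze_transcript"],
  "render.worker": ["render.clip"],
  "export.worker": ["export.zip"],
  "maintenance.worker": ["maintenance.cleanup"],
} as const;

// Returns every queue name that is claimed by more than one worker process.
function findSharedQueues(map: Record<string, readonly string[]>): string[] {
  const owner = new Map<string, string>();
  const shared = new Set<string>();
  for (const [worker, queues] of Object.entries(map)) {
    for (const queue of queues) {
      const previous = owner.get(queue);
      if (previous !== undefined && previous !== worker) shared.add(queue);
      owner.set(queue, worker);
    }
  }
  return Array.from(shared);
}
```

An empty result from findSharedQueues(WORKER_QUEUES) means no two processes compete for the same queue, which is the property the isolation argument depends on.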

Each process boots WorkerAppModule (NestJS application context without HTTP) and registers BullMQ Worker instances with configurable concurrency from env (workers.*Concurrency).
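
A rough sketch of how a per-domain concurrency value might be read from the environment. The variable name WORKER_RENDER_CONCURRENCY and the fallback of 1 are assumptions for illustration; the notes only establish that the values come from env via workers.*Concurrency.

```typescript
// Env-like map so the helper can be tested without touching process.env.
type Env = Record<string, string | undefined>;

// Hypothetical helper: parse a positive integer concurrency, else use the fallback.
function readConcurrency(env: Env, key: string, fallback = 1): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  // Reject NaN, fractions, zero, and negatives instead of passing them to a worker.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Example wiring; in a real process `env` would be process.env.
const exampleEnv: Env = { WORKER_RENDER_CONCURRENCY: "2" };
const renderConcurrency = readConcurrency(exampleEnv, "WORKER_RENDER_CONCURRENCY");
```

The resulting number would be passed as the concurrency option when each BullMQ Worker is constructed.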

Why not one mega-worker

  1. Resource isolation: render and probe both use FFmpeg; separate pools keep a backlog in one from stalling the other.
  2. Scaling: Compose can scale worker-render replicas independently (inferred ops pattern; the default Compose file runs a single replica).
  3. Failure blast radius: an OOM in the whisper worker does not take down the API.
  4. Dependency packaging: the transcription worker image includes Python + faster-whisper; the AI worker needs only Mastra/OpenRouter.

Maintenance scheduler

The maintenance worker owns a second BullMQ Queue that it uses as a ticker: a setInterval callback enqueues maintenance.cleanup jobs with time-based idempotency keys (maintenance.cleanup:${Date.now()}). Concurrency = 1 keeps sweeps from overlapping.
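
The ticker pattern can be sketched without BullMQ by injecting the enqueue function. Everything named here (EnqueueFn, cleanupKey, startCleanupTicker, the injectable clock) is hypothetical; in the real worker the enqueue call would go through the maintenance queue.

```typescript
// Hypothetical enqueue signature; the real worker would add the job to its BullMQ queue here.
type EnqueueFn = (jobName: string, key: string) => void;

// Builds the time-based key described in the text.
function cleanupKey(now: number): string {
  return `maintenance.cleanup:${now}`;
}

// Enqueues one cleanup job per interval and returns a function that stops the ticker.
function startCleanupTicker(
  enqueue: EnqueueFn,
  intervalMs: number,
  clock: () => number = Date.now,
): () => void {
  const timer = setInterval(() => {
    enqueue("maintenance.cleanup", cleanupKey(clock()));
  }, intervalMs);
  return () => clearInterval(timer);
}
```

Because the key embeds a millisecond timestamp, each tick gets its own key. The returned stop function is what a shutdown handler would call before closing the queue.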

Shared infrastructure

All workers share:

  • DATABASE_URL (Prisma)
  • REDIS_URL (BullMQ connection from RedisService.getBullMqClient())
  • MinIO credentials
  • WORKER_TEMP_DIR for per-job temp directories (createJobTempDir + cleanupTempDir)
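
A per-job temp directory helper might look like the following. The function names match those mentioned above, but the bodies are a guess at their behavior, not the project's code; the job-<id>- prefix, the withJobTempDir wrapper, and the fallback base directory are all assumptions.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// WORKER_TEMP_DIR comes from the environment; the OS temp dir fallback is an assumption.
const baseDir = process.env.WORKER_TEMP_DIR ?? path.join(os.tmpdir(), "worker-tmp");

// Sketch only: creates a unique directory for one job under the base directory.
function createJobTempDir(base: string, jobId: string): string {
  fs.mkdirSync(base, { recursive: true });
  return fs.mkdtempSync(path.join(base, `job-${jobId}-`));
}

// Sketch only: removes the directory and its contents; a missing directory is not an error.
function cleanupTempDir(dir: string): void {
  fs.rmSync(dir, { recursive: true, force: true });
}

// Hypothetical wrapper: always clean up, even when the job body throws.
function withJobTempDir<T>(base: string, jobId: string, run: (dir: string) => T): T {
  const dir = createJobTempDir(base, jobId);
  try {
    return run(dir);
  } finally {
    cleanupTempDir(dir);
  }
}
```

The wrapper here is synchronous; a job body that returns a promise would need an async variant that awaits the body before cleaning up.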

Graceful shutdown

SIGINT / SIGTERM → close all Worker instances → close maintenance queue → app.close() → process.exit(0).
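
The shutdown order can be sketched with injected dependencies. The Closable interface and the shutdown function are hypothetical stand-ins; the sketch relies only on the fact that BullMQ Worker and Queue and the NestJS application context each expose an async close().

```typescript
// Minimal shape shared by the things being closed (hypothetical interface).
interface Closable {
  close(): Promise<void>;
}

// Closes all workers, then the maintenance queue, then the app, then exits with code 0.
async function shutdown(
  workers: Closable[],
  maintenanceQueue: Closable,
  app: Closable,
  exit: (code: number) => void,
): Promise<void> {
  await Promise.all(workers.map((worker) => worker.close()));
  await maintenanceQueue.close();
  await app.close();
  exit(0);
}

// Wiring sketch (commented out so the example has no side effects):
// for (const signal of ["SIGINT", "SIGTERM"] as const) {
//   process.once(signal, () => void shutdown(workers, maintenanceQueue, app, process.exit));
// }
```

Passing exit as a parameter keeps the ordering testable without terminating the test process.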

Failure mode

If only the API is up and the workers are down, jobs accumulate in Redis while their DB rows stay in the queued state. The UI shows a stalled pipeline, and nothing in the code raises an alert (inferred ops gap).

See diagram: diagrams/system-overview.mmd.