Purpose
Non-deterministic LLM outputs have to be debuggable in production: it must be possible to tell which prompt version, which model, and which input produced a bad clip boundary. The system therefore treats AI calls as metered, auditable batch jobs, not fire-and-forget HTTP requests.
ai_runs lifecycle
Each agent step in AnalyzeTranscriptWorkflowService follows the same pattern:
- Resolve prompt: promptVersions.resolveActive(promptKey) → FK to prompt_versions.
- Open run: aiRuns.startRun({ workflowName: 'analyze_transcript', agentName, inputHash, inputPreview, status: 'running' }).
- Call model: ai.generateStructured(...) via OpenRouter / Mastra.
- Close run: aiRuns.finishRun with succeeded | failed | repaired, plus optional outputJson, validationErrorsJson, tokens, and latencyMs.
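The four steps above can be sketched as one wrapper around the model call. This is a minimal sketch under stated assumptions, not the service's actual code: the `RunStore` and `PromptResolver` interfaces, the `runAgentStep` helper, the `promptVersionId` field, and the `finishRun(runId, result)` signature are all hypothetical; only the method and field names quoted in the list come from this document.

```typescript
// Hypothetical shapes for illustration only.
interface StartRunInput {
  workflowName: string;
  agentName: string;
  promptVersionId: string; // assumed name for the FK to prompt_versions
  inputHash: string;
  inputPreview: string;
  status: "running";
}

interface FinishRunInput {
  status: "succeeded" | "failed" | "repaired";
  outputJson?: unknown;
  validationErrorsJson?: unknown;
  latencyMs: number; // token counts omitted to keep the sketch short
}

interface RunStore {
  startRun(input: StartRunInput): Promise<{ id: string }>;
  finishRun(runId: string, result: FinishRunInput): Promise<void>;
}

interface PromptResolver {
  resolveActive(promptKey: string): Promise<{ id: string; instructions: string }>;
}

interface StepInput {
  promptKey: string;
  agentName: string;
  inputHash: string;
  inputPreview: string;
}

async function runAgentStep<T>(
  deps: {
    promptVersions: PromptResolver;
    aiRuns: RunStore;
    // Stands in for ai.generateStructured(...); its real signature is not shown here.
    callModel: (instructions: string) => Promise<T>;
  },
  step: StepInput,
): Promise<T> {
  // 1. Resolve the active prompt version.
  const prompt = await deps.promptVersions.resolveActive(step.promptKey);

  // 2. Open the run before the model is called, so a row exists even if the call crashes.
  const run = await deps.aiRuns.startRun({
    workflowName: "analyze_transcript",
    agentName: step.agentName,
    promptVersionId: prompt.id,
    inputHash: step.inputHash,
    inputPreview: step.inputPreview,
    status: "running",
  });

  const startedAt = Date.now();
  try {
    // 3. Call the model.
    const output = await deps.callModel(prompt.instructions);
    // 4a. Close the run as succeeded.
    await deps.aiRuns.finishRun(run.id, {
      status: "succeeded",
      outputJson: output,
      latencyMs: Date.now() - startedAt,
    });
    return output;
  } catch (err) {
    // 4b. Close the run as failed, then let the caller see the original error.
    await deps.aiRuns.finishRun(run.id, {
      status: "failed",
      latencyMs: Date.now() - startedAt,
    });
    throw err;
  }
}
```

Passing the stores and the model call in as dependencies keeps the sketch testable with in-memory fakes; the real service may wire these differently.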
Input fingerprint
- inputHash = sha256(JSON.stringify(input)): dedup analysis and a hook for future cache layers (cache not implemented today).
- inputPreview: first 600 chars for admin UI; full chunk text stays in transcript_chunks.
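The two fingerprint fields can be computed along these lines. This is a sketch assuming Node's built-in `node:crypto` module; the `fingerprintInput` helper, the hex encoding of the digest, and taking the preview from a separate `previewSource` string are assumptions, not details confirmed by this document.

```typescript
import { createHash } from "node:crypto";

// The 600-character limit comes from the text above.
const INPUT_PREVIEW_MAX_CHARS = 600;

interface InputFingerprint {
  inputHash: string;
  inputPreview: string;
}

// Hypothetical helper: hashes the serialized input and truncates the preview text.
function fingerprintInput(input: unknown, previewSource: string): InputFingerprint {
  const serialized = JSON.stringify(input);
  if (serialized === undefined) {
    // JSON.stringify returns undefined for values such as undefined or a bare function.
    throw new Error("input is not JSON-serializable");
  }
  const inputHash = createHash("sha256").update(serialized, "utf8").digest("hex");
  // slice counts UTF-16 code units, so "600 chars" here means 600 code units.
  const inputPreview = previewSource.slice(0, INPUT_PREVIEW_MAX_CHARS);
  return { inputHash, inputPreview };
}
```

One property worth keeping in mind for dedup analysis: `JSON.stringify` serializes string-keyed properties in insertion order, so two objects with the same fields built in a different order produce different hashes.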
Prompt registry
| Piece | Role |
|---|---|
| PROMPT_DEFINITIONS + PROMPT_KEYS | Central agent names and instructions |
| PROMPT_VERSION | Bumps tie into idempotency keys (analyzeTranscriptKey(..., promptVersion)) |
| ensureRegistered on workflow start | Creates DB rows when keys are missing |
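
To illustrate the registry roles in the table, here is a small in-memory sketch. Everything in it is hypothetical: the `Map` standing in for the prompt_versions table, the `key@version` row id, and the `exampleIdempotencyKey` builder. The real arguments of analyzeTranscriptKey are not shown in this document, so the builder only demonstrates why a version bump yields a new key.

```typescript
interface PromptDefinition {
  agentName: string;
  instructions: string;
}

// Hypothetical stand-in for ensureRegistered: inserts a row for every
// (key, version) pair that is missing and reports which rows it created.
function ensureRegistered(
  definitions: Record<string, PromptDefinition>,
  version: string,
  existingRows: Map<string, PromptDefinition>,
): string[] {
  const created: string[] = [];
  for (const key of Object.keys(definitions)) {
    const rowId = `${key}@${version}`;
    if (!existingRows.has(rowId)) {
      existingRows.set(rowId, definitions[key]);
      created.push(rowId);
    }
  }
  return created;
}

// Illustrative key builder only; not the real analyzeTranscriptKey signature.
function exampleIdempotencyKey(projectId: string, promptVersion: string): string {
  return `analyze_transcript:${projectId}:${promptVersion}`;
}
```

Because the prompt version is part of the key, bumping it makes a rerun look like new work instead of a duplicate of the previous analysis.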
Structured output
Zod schemas (chunkScoutOutputSchema, clipRankerOutputSchema, clipPlanOutputSchema, jsonRepairOutputSchema) validate every response before persisting candidates.
Invalid scout, ranker, or planner output produces a failed run and an AiAnalysisError carrying a stable error code for API consumers.
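The validate-then-fail step can be sketched without depending on a specific schema library version. The `SafeParseable` interface below is an assumption that mirrors the shape of a Zod-style `safeParse` result; the `AiAnalysisError` class is a stand-in whose fields are not taken from this document, and the error code format is hypothetical.

```typescript
// Minimal interface shaped like a Zod-style schema's safeParse.
interface SafeParseable<T> {
  safeParse(input: unknown):
    | { success: true; data: T }
    | { success: false; error: { issues: { path: (string | number)[]; message: string }[] } };
}

// Stand-in error class; the real AiAnalysisError fields are not shown in this document.
class AiAnalysisError extends Error {
  code: string;
  validationErrors: unknown;

  constructor(code: string, message: string, validationErrors: unknown) {
    super(message);
    this.name = "AiAnalysisError";
    this.code = code;
    this.validationErrors = validationErrors;
  }
}

// Returns the typed output, or throws with a code derived from the agent name
// (hypothetical format, e.g. SCOUT_OUTPUT_INVALID).
function parseAgentOutput<T>(agentName: string, schema: SafeParseable<T>, raw: unknown): T {
  const result = schema.safeParse(raw);
  if (result.success) {
    return result.data;
  }
  throw new AiAnalysisError(
    `${agentName.toUpperCase()}_OUTPUT_INVALID`,
    `${agentName} returned output that failed schema validation`,
    result.error.issues,
  );
}
```

A caller would catch the error, store `validationErrors` in validationErrorsJson when closing the run as failed, and surface only the code to API consumers.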
JSON repair path
When the fix agent succeeds, the recovery is recorded as a separate run with status repaired. Keeping the first-pass planner failure and the recovery in different rows lets support triage tell them apart.
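A simplified sketch of that two-row outcome follows. The `planWithRepair` helper, the callback names, and the agent names `clip_planner` and `json_repair` are hypothetical, and the sketch records each run only when it closes, which is a shortcut compared with the open-then-close lifecycle described earlier.

```typescript
interface RunRecord {
  agentName: string;
  status: "succeeded" | "failed" | "repaired";
}

async function planWithRepair<T>(opts: {
  runPlanner: () => Promise<unknown>;
  runRepair: (badOutput: unknown) => Promise<unknown>;
  validate: (raw: unknown) => T | null; // null means validation failed
  recordRun: (run: RunRecord) => Promise<void>;
}): Promise<T> {
  const raw = await opts.runPlanner();
  const firstPass = opts.validate(raw);
  if (firstPass !== null) {
    await opts.recordRun({ agentName: "clip_planner", status: "succeeded" });
    return firstPass;
  }

  // The first-pass failure keeps its own row, whatever happens next.
  await opts.recordRun({ agentName: "clip_planner", status: "failed" });

  const repairedRaw = await opts.runRepair(raw);
  const repaired = opts.validate(repairedRaw);
  if (repaired === null) {
    await opts.recordRun({ agentName: "json_repair", status: "failed" });
    throw new Error("planner output could not be repaired");
  }

  // A successful recovery is a second row with status repaired.
  await opts.recordRun({ agentName: "json_repair", status: "repaired" });
  return repaired;
}
```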
Product linkage
clip_candidates.sourceAiRunId points to the planner run, not scout runs. Scouts are many-to-one per project analyze job.
Operational invariant: every LLM invocation must leave an ai_runs row before side
effects touch clip_candidates. If you cannot replay from the row, the step is not
production-ready.
Failure modes
| Risk | Impact |
|---|---|
| Partial scout success, then job failure | Orphaned succeeded ai_runs rows remain for the early chunks; a rerun may pay for the same scout calls again unless the idempotency key blocks it |
| inputPreview carries PII | Chunk snippet in DB: fine for internal admin; consider redaction for multi-tenant |
| AI_USE_FIXTURE=true in prod | Deterministic fake clips; requires env guard |
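
The env guard named in the last row could look like the sketch below. It assumes that production is signalled by `NODE_ENV=production`, which this document does not confirm, and the `assertFixtureModeAllowed` name is hypothetical.

```typescript
// Throws at startup if fixture mode is enabled in a production environment.
function assertFixtureModeAllowed(env: Record<string, string | undefined>): void {
  const fixtureEnabled = env.AI_USE_FIXTURE === "true";
  const isProduction = env.NODE_ENV === "production"; // assumption: NODE_ENV marks production
  if (fixtureEnabled && isProduction) {
    throw new Error("AI_USE_FIXTURE=true is not allowed when NODE_ENV=production");
  }
}

// Typical call site during boot (left as a comment so the sketch has no side effects):
// assertFixtureModeAllowed(process.env);
```

Failing fast at boot is a deliberate choice here: a crash on deploy is easier to notice than fake clips reaching users.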
Tradeoff
Per-chunk scout runs improve quality and context fit but multiply cost vs single-pass summarization. tokenEstimate on chunks is the hook for future budgeting dashboards.
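As a rough illustration of how tokenEstimate could feed a budget view, the sketch below sums the estimates for a job and applies a per-token price. The `estimateScoutCost` helper and the price parameter are hypothetical, and the calculation covers input tokens only, so it understates real spend.

```typescript
interface ChunkLike {
  tokenEstimate: number;
}

interface ScoutCostEstimate {
  totalTokens: number;
  estimatedUsd: number;
}

// Sums per-chunk token estimates and converts them to a dollar figure.
// usdPerMillionInputTokens is supplied by the caller; no real price is assumed.
function estimateScoutCost(chunks: ChunkLike[], usdPerMillionInputTokens: number): ScoutCostEstimate {
  const totalTokens = chunks.reduce((sum, chunk) => sum + chunk.tokenEstimate, 0);
  const estimatedUsd = (totalTokens / 1_000_000) * usdPerMillionInputTokens;
  return { totalTokens, estimatedUsd };
}
```

With one scout run per chunk, the total grows linearly with chunk count, which is the cost multiplier the tradeoff refers to.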