Sergey Orsik.dev

2026-03-18

Agent System Design Thoughts

Why agents are workflows, not loops — and how to keep tool use safe.

Agents are workflows

An agent "loop" in production is really a durable workflow with:

  • Explicit step boundaries
  • Timeouts per step
  • Compensation on failure

Cron jobs plus ad hoc scripts tend to fail silently. A workflow engine such as Temporal (or similar) records a per-run history you can audit.
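
To make the three properties concrete, here is a minimal in-memory sketch, not a durable engine and not Temporal's API. The names `Step`, `run_workflow`, and the `history` list are hypothetical, chosen for illustration. Each step has its own timeout, every transition is appended to an auditable history, and a failure triggers compensation of completed steps in reverse order.

```python
import concurrent.futures as cf
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    """One explicit step boundary (hypothetical type for this sketch)."""
    name: str
    action: Callable[[], object]
    timeout_s: float
    compensate: Optional[Callable[[], None]] = None


def run_workflow(steps, history):
    """Run steps in order; on failure, undo completed steps in reverse.

    `history` is a plain list standing in for an engine's persisted log.
    Returns True if every step completed, False otherwise.
    """
    completed = []
    pool = cf.ThreadPoolExecutor(max_workers=1)
    try:
        for step in steps:
            history.append((step.name, "started"))
            try:
                # Per-step timeout: stop waiting after timeout_s seconds.
                pool.submit(step.action).result(timeout=step.timeout_s)
            except Exception as exc:
                history.append((step.name, f"failed: {type(exc).__name__}"))
                for done in reversed(completed):
                    if done.compensate:
                        done.compensate()
                        history.append((done.name, "compensated"))
                return False
            completed.append(step)
            history.append((step.name, "completed"))
        return True
    finally:
        # A timed-out action may still be running; do not block on it here.
        pool.shutdown(wait=False)
```

A real engine would persist the history and resume after a crash; this sketch only shows the control flow that makes a failure visible and reversible.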

Tool gateway pattern

Never let the LLM call arbitrary URLs. Route through a gateway that:

  1. Validates payload against JSON Schema
  2. Applies per-tenant rate limits
  3. Attaches trace IDs and logs structured results
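
The three gateway duties above can be sketched as follows. This is an illustrative sketch under stated assumptions: `ToolGateway`, `validate`, and the result shape are hypothetical names, the rate limiter is a simple sliding window per tenant, and `validate` checks only a tiny subset of JSON Schema (`required`, `properties.type`, `additionalProperties`) as a stand-in for a real validator library.

```python
import time
import uuid
from collections import defaultdict, deque

_TYPES = {"string": str, "integer": int, "number": (int, float),
          "boolean": bool, "object": dict, "array": list}


def validate(payload, schema):
    """Return a list of errors. Stand-in for a full JSON Schema validator."""
    errors = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in payload:
            errors.append(f"missing required field: {name}")
    for name, value in payload.items():
        if name not in props:
            if schema.get("additionalProperties", True) is False:
                errors.append(f"unexpected field: {name}")
            continue
        kind = props[name].get("type")
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        if kind and (not isinstance(value, _TYPES[kind])
                     or (isinstance(value, bool) and kind != "boolean")):
            errors.append(f"wrong type for {name}: expected {kind}")
    return errors


class ToolGateway:
    """Single entry point for tool calls (hypothetical class for this sketch)."""

    def __init__(self, tools, max_calls, window_s, clock=time.monotonic):
        self.tools = tools            # tool name -> (schema, handler)
        self.max_calls = max_calls    # allowed calls per tenant per window
        self.window_s = window_s
        self.clock = clock
        self.calls = defaultdict(deque)  # tenant -> timestamps of accepted calls
        self.log = []                 # structured results, one dict per call

    def call(self, tenant, tool, payload):
        trace_id = uuid.uuid4().hex
        result = self._dispatch(tenant, tool, payload)
        self.log.append({"trace_id": trace_id, "tenant": tenant,
                         "tool": tool, **result})
        return {"trace_id": trace_id, **result}

    def _dispatch(self, tenant, tool, payload):
        if tool not in self.tools:
            return {"status": "rejected", "reason": "unknown tool"}
        schema, handler = self.tools[tool]
        errors = validate(payload, schema)
        if errors:
            return {"status": "rejected", "reason": "; ".join(errors)}
        now = self.clock()
        recent = self.calls[tenant]
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()          # drop timestamps outside the window
        if len(recent) >= self.max_calls:
            return {"status": "rate_limited", "reason": "tenant quota exceeded"}
        recent.append(now)
        return {"status": "ok", "output": handler(payload)}
```

Because the model can only name a registered tool and supply a payload, it never controls the URL or transport; those live inside the handlers the gateway owns.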

Planner vs executor

Split responsibilities:

  • Planner proposes steps (LLM)
  • Executor validates and runs (code)

The runtime should reject plans that reference unknown tools or exceed step budgets.
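
A minimal sketch of that rejection rule, assuming a plan is just an ordered list of tool calls. `PlanStep`, `PlanRejected`, `check_plan`, and `execute` are hypothetical names for illustration; the point is that the check runs in ordinary code before any tool is invoked.

```python
from dataclasses import dataclass, field


@dataclass
class PlanStep:
    """One step proposed by the planner (hypothetical shape)."""
    tool: str
    args: dict = field(default_factory=dict)


class PlanRejected(Exception):
    """Raised when a plan fails validation; nothing has been executed yet."""


def check_plan(plan, known_tools, max_steps):
    """Return a list of problems; an empty list means the plan is acceptable."""
    problems = []
    if len(plan) > max_steps:
        problems.append(f"plan has {len(plan)} steps, budget is {max_steps}")
    for i, step in enumerate(plan):
        if step.tool not in known_tools:
            problems.append(f"step {i}: unknown tool {step.tool!r}")
    return problems


def execute(plan, tools, max_steps):
    """Validate the whole plan first, then run each step in order."""
    problems = check_plan(plan, tools, max_steps)
    if problems:
        raise PlanRejected("; ".join(problems))
    return [tools[step.tool](**step.args) for step in plan]
```

Validating the full plan up front means a bad step at position five cannot leave side effects from steps one through four.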

Failure modes I've seen

Symptom                  | Root cause                  | Fix
Infinite retries         | No max attempts on activity | Cap + dead-letter queue
Nondeterministic replays | Random IDs in workflow code | Deterministic IDs only
Tool hallucination       | Schema too loose            | Tighten required fields
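
The first two fixes are small enough to sketch. The helpers below are hypothetical and deliberately simplified: `run_with_retry_cap` retries without backoff and uses a plain list as the dead-letter queue, and `deterministic_id` derives an ID from values that are identical on every replay (here, an assumed workflow ID and step index) instead of calling a random generator.

```python
import hashlib


def run_with_retry_cap(task, payload, max_attempts, dead_letters):
    """Try `task` at most `max_attempts` times, then park the payload.

    `dead_letters` is a list standing in for a real dead-letter queue.
    Returns the task result, or None if every attempt failed.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return task(payload)
        except Exception as exc:
            last_error = exc
    dead_letters.append({
        "payload": payload,
        "attempts": max_attempts,
        "error": repr(last_error),
    })
    return None


def deterministic_id(workflow_id, step_index):
    """Same inputs always give the same ID, so replays stay consistent."""
    digest = hashlib.sha256(f"{workflow_id}:{step_index}".encode()).hexdigest()
    return digest[:16]
```

A production version would add backoff between attempts and alert on dead-letter growth; the cap itself is what turns an infinite retry into a visible, bounded failure.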

Takeaway

Ship the smallest agent that solves one workflow end-to-end. General-purpose agents are a research problem, not a v1 product feature.