Sergey Orsik.dev

2026-03-18

Agent System Design Thoughts

Why agents are workflows, not loops — and how to keep tool use safe.

Agents are workflows

An agent "loop" in production is really a durable workflow with:

Explicit step boundaries
Timeouts per step
Compensation on failure

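Cron jobs and ad hoc scripts tend to fail silently. A workflow engine such as Temporal (or similar) records a history for every run, so you can audit exactly which step failed and why.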
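To make the three properties above concrete, here is a minimal in-memory sketch in plain Python. It is not Temporal's API and it is not durable (nothing is persisted); the names `Step`, `run_workflow`, and `demo` are hypothetical and only illustrate step boundaries, per-step timeouts, compensation in reverse order, and an append-only history.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Optional, Tuple


@dataclass
class Step:
    name: str                                   # explicit step boundary
    run: Callable[[], Awaitable[None]]
    timeout_s: float                            # timeout for this step only
    compensate: Optional[Callable[[], Awaitable[None]]] = None


async def run_workflow(steps: List[Step], history: List[str]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse."""
    done: List[Step] = []
    for step in steps:
        history.append(f"start:{step.name}")
        try:
            await asyncio.wait_for(step.run(), timeout=step.timeout_s)
        except Exception as exc:  # a timeout surfaces here as TimeoutError
            history.append(f"fail:{step.name}:{type(exc).__name__}")
            for prev in reversed(done):
                if prev.compensate is not None:
                    await prev.compensate()
                    history.append(f"compensate:{prev.name}")
            return False
        history.append(f"ok:{step.name}")
        done.append(step)
    return True


async def demo() -> Tuple[bool, List[str]]:
    history: List[str] = []

    async def reserve() -> None:
        pass

    async def release() -> None:
        pass

    async def charge() -> None:
        await asyncio.sleep(1)  # deliberately slower than its timeout

    steps = [
        Step("reserve", reserve, timeout_s=0.5, compensate=release),
        Step("charge", charge, timeout_s=0.05),
    ]
    ok = await run_workflow(steps, history)
    return ok, history
```

Calling `asyncio.run(demo())` runs a two-step workflow in which the second step times out and the first is compensated. A real engine would also persist the history so a crashed worker can resume from it, which this sketch leaves out.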

Tool gateway pattern

Never let the LLM call arbitrary URLs. Route through a gateway that:

Validates payload against JSON Schema
Applies per-tenant rate limits
Attaches trace IDs and logs structured results
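A rough sketch of such a gateway, using only the standard library. Everything here is an assumption for illustration: `ToolGateway`, `GatewayError`, and the field-to-type schema format are hypothetical, and the exact-fields type check is a deliberately small stand-in for a real JSON Schema validator. The rate limit is a simple sliding window per tenant.

```python
import time
import uuid
from collections import defaultdict, deque


class GatewayError(Exception):
    """Raised when a call is rejected before reaching the tool."""


class ToolGateway:
    def __init__(self, tools, max_calls, window_s, clock=time.monotonic):
        # tools: name -> (schema, handler); schema maps field name -> Python type
        self.tools = tools
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock
        self.calls = defaultdict(deque)  # tenant -> timestamps of accepted calls
        self.log = []                    # structured records, one per call

    def _validate(self, tool, payload):
        if tool not in self.tools:
            raise GatewayError(f"unknown tool: {tool}")
        schema, _ = self.tools[tool]
        if set(payload) != set(schema):
            raise GatewayError(f"fields must be exactly {sorted(schema)}")
        for field, expected in schema.items():
            if not isinstance(payload[field], expected):
                raise GatewayError(f"{field} must be {expected.__name__}")

    def _check_rate(self, tenant):
        now = self.clock()
        recent = self.calls[tenant]
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()             # drop calls that left the window
        if len(recent) >= self.max_calls:
            raise GatewayError(f"rate limit exceeded for tenant {tenant}")
        recent.append(now)

    def call(self, tenant, tool, payload):
        record = {"trace_id": uuid.uuid4().hex, "tenant": tenant, "tool": tool}
        try:
            self._validate(tool, payload)
            self._check_rate(tenant)
            _, handler = self.tools[tool]
            record.update(status="ok", result=handler(**payload))
        except GatewayError as exc:
            record.update(status="rejected", error=str(exc))
        self.log.append(record)
        return record
```

Validation runs before the rate check, so malformed calls are rejected without consuming a tenant's quota. Rejections are logged with a trace ID just like successes, which keeps hallucinated tool calls visible.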

Planner vs executor

Split responsibilities:

Planner proposes steps (LLM)
Executor validates and runs (code)

The runtime should reject plans that reference unknown tools or exceed step budgets.
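As a sketch of that rejection rule, the executor can check the whole plan before running any step. The names `PlanStep`, `PlanRejected`, `validate_plan`, and `execute` are hypothetical, and the plan format (a list of tool name plus arguments) is an assumption about what the planner emits.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class PlanStep:
    tool: str
    args: Dict[str, Any]


class PlanRejected(Exception):
    """Raised when a proposed plan fails validation."""


def validate_plan(plan: List[PlanStep], known_tools, max_steps: int) -> None:
    # Reject plans that exceed the step budget.
    if len(plan) > max_steps:
        raise PlanRejected(f"plan has {len(plan)} steps, budget is {max_steps}")
    # Reject plans that reference tools the executor does not know.
    unknown = sorted({step.tool for step in plan} - set(known_tools))
    if unknown:
        raise PlanRejected(f"unknown tools: {unknown}")


def execute(plan: List[PlanStep], tools: Dict[str, Callable[..., Any]],
            max_steps: int) -> List[Any]:
    # Validate everything up front so a bad plan runs zero steps.
    validate_plan(plan, tools, max_steps)
    return [tools[step.tool](**step.args) for step in plan]
```

Validating the full plan first means a plan with a bad step at position five never executes steps one through four.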

Failure modes I've seen

Symptom	Root cause	Fix
Infinite retries	No max attempts on activity	Cap + dead-letter queue
Nondeterministic replays	Random IDs generated in workflow code	Derive IDs from workflow inputs, or generate them in an activity
Tool hallucination	Schema too loose	Tighten required fields
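The first two fixes in the table can be sketched in a few lines. This is an illustration under assumptions, not any engine's API: `run_with_cap` and `step_id` are hypothetical helpers, the dead-letter queue is just a list, and the namespace string is a placeholder. A production version would also add backoff between attempts.

```python
import uuid

# Placeholder namespace for derived IDs; any fixed UUID works.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "agents.example.com")


def step_id(workflow_id: str, step_index: int) -> str:
    """Derive an ID from workflow inputs so a replay produces the same value."""
    return str(uuid.uuid5(NAMESPACE, f"{workflow_id}/{step_index}"))


def run_with_cap(task, payload, max_attempts, dead_letters):
    """Try a task at most max_attempts times, then park it for inspection."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return task(payload)
        except Exception as exc:
            last_error = exc
    # Out of attempts: record the failure instead of retrying forever.
    dead_letters.append({
        "payload": payload,
        "attempts": max_attempts,
        "error": repr(last_error),
    })
    return None
```

Unlike `uuid.uuid4()`, `uuid.uuid5` is a pure function of its inputs, so calling `step_id("wf-1", 0)` during a replay returns the same string as the original run.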

Takeaway

Ship the smallest agent that solves one workflow end-to-end. General-purpose agents are a research problem, not a v1 product feature.