Most beginners build only layers 2 and 3 (orchestration logic and the LLM) and forget the other four. Below is the full diagram of an agent stack that runs in production: Input, Orchestration, LLM + Memory, Tools, Output, Infrastructure.
Map of an agent that actually works in production — every layer, what it does, and the popular tools at each layer.
┌─────────────────────────────────────────────────────────────────┐
│                     AI AGENT: 6-LAYER STACK                     │
└─────────────────────────────────────────────────────────────────┘

1. INPUT LAYER ──────────────────────────────────────────────────
   What triggers the agent
   • User prompt (chat interface)
   • Webhook event (form submit, API call)
   • Cron job (scheduled task)
   • File watcher (new email, new ticket)
                                ↓
2. ORCHESTRATION LAYER ───────────────────────────────────────────
   The "brain": decides what to do next
   • LangChain · LangGraph · CrewAI · Pydantic AI
   • Loop: Plan → Act → Observe → Repeat
   • Decides: call LLM? call tool? finish?
                                ↓
3. LLM + MEMORY LAYER ────────────────────────────────────────────
   Reasoning + state
   • LLM: Claude / GPT / Gemini
   • Short-term: conversation history (context window)
   • Long-term: vector DB (Pinecone / Weaviate / pgvector)
   • Prevents the "amnesia" problem between calls
                                ↓
4. TOOLS LAYER ───────────────────────────────────────────────────
   What the agent can DO
   • Web search (Brave, Tavily, SerpAPI)
   • Code execution (E2B, Modal sandboxes)
   • Email send/read (Gmail API, SendGrid)
   • File operations, DB queries, custom APIs
   • Each tool = a function the LLM can call
                                ↓
5. OUTPUT LAYER ──────────────────────────────────────────────────
   The result
   • Structured response (JSON, markdown, HTML)
   • Trigger to another system (webhook out)
   • Human-readable text
   • File generation (PDF, doc, image)
                                ↓
6. INFRASTRUCTURE LAYER ──────────────────────────────────────────
   What keeps it running in production
   • Queue: Redis, BullMQ, Celery (for async tasks)
   • Logging: LangSmith, Helicone, Langfuse (observability)
   • Deployment: Docker, Modal, AWS Lambda, Cloud Run
   • Auth + rate limiting + cost tracking
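To make the Plan → Act → Observe loop from layer 2 concrete, here is a toy sketch in Python. It is for illustration only; in a real project an orchestration framework would own this loop. Every name in it (`fake_llm`, `TOOLS`, `run_agent`, the two stub tools) is hypothetical, and `fake_llm` stands in for a real model call.

```python
from typing import Callable

# Hypothetical tools: each one is a plain function the "LLM" can call.
def web_search(query: str) -> str:
    return f"stub results for {query!r}"

def send_email(body: str) -> str:
    return f"email queued: {body[:40]}"

TOOLS: dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "send_email": send_email,
}

def fake_llm(task: str, history: list[str]) -> dict:
    """Stand-in for a real model call; returns the next action as a dict."""
    if not any(h.startswith("web_search") for h in history):
        return {"action": "web_search", "arg": task}
    if not any(h.startswith("send_email") for h in history):
        return {"action": "send_email", "arg": history[-1]}
    return {"action": "finish", "arg": ""}

def run_agent(task: str, max_steps: int = 5) -> list[str]:
    history: list[str] = []                  # short-term memory
    for _ in range(max_steps):               # hard cap so the loop always ends
        decision = fake_llm(task, history)   # Plan
        if decision["action"] == "finish":
            break
        tool = TOOLS[decision["action"]]
        result = tool(decision["arg"])       # Act
        history.append(f"{decision['action']} -> {result}")  # Observe
    return history
```

Calling `run_agent("agent stack layers")` runs the search stub, then the email stub, then stops when the stand-in model returns `finish`. The `max_steps` cap is the simplest possible answer to "when to stop".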
[ ] Input
Is there an automated trigger (not just a button)?
If a human has to click "run" — it's a script, not an agent.
[ ] Orchestration
Is there clear logic for "when to stop"?
Without termination criteria, agents loop forever and burn money.
[ ] Memory
Is there a vector DB or just the context window?
Context window alone = the agent forgets between sessions.
[ ] Tools
At least 2 distinct tools the agent can call?
One tool = a function call. Two+ = an agent.
[ ] Output
Does the result flow to another system or just sit on screen?
"Sits on screen" = it's a chatbot, not an agent.
[ ] Infra
Queue + Logging + Error handling all in place?
Without these three: the agent works on day 1, breaks on day 7.
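The Orchestration check above asks for clear stop logic. One way to sketch it, assuming you can estimate tokens per call, is a small budget guard that ends a run on either a step count or an estimated spend. The `Budget` class and the price passed to `charge` are hypothetical, not a real rate card.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Budget:
    """Hypothetical guard: stops a run on step count or estimated spend."""
    max_steps: int = 10
    max_cost_usd: float = 0.50
    steps: int = 0
    cost_usd: float = 0.0

    def charge(self, tokens: int, usd_per_1k_tokens: float) -> None:
        # Record one LLM call; the price is supplied by the caller.
        self.steps += 1
        self.cost_usd += tokens / 1000 * usd_per_1k_tokens

    def stop_reason(self) -> Optional[str]:
        # Returns None while the run may continue.
        if self.steps >= self.max_steps:
            return "step limit reached"
        if self.cost_usd >= self.max_cost_usd:
            return "cost limit reached"
        return None

budget = Budget(max_steps=3, max_cost_usd=1.0)
while budget.stop_reason() is None:
    budget.charge(tokens=2000, usd_per_1k_tokens=0.01)  # pretend LLM call
```

Checking `stop_reason()` before every model call keeps both failure modes (endless loops and runaway cost) behind one explicit condition.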
──────────────────────────────────────────
Common mistakes I see in 8/10 agent projects:
- Skipping memory layer → agent has amnesia
- No tool integration → just a fancy chatbot
- No queue → blocking calls take down the app
- No logging → impossible to debug when it fails
- No rate limit on LLM calls → cost runaway
Set up once. Then it runs by itself.
1. Input: Decide what makes the agent run: a webhook, a cron job, or a file watcher. If there's no automated trigger, build that first.
2. Orchestration: Use LangGraph if you're serious, CrewAI if you want to move fast. Don't write a custom agent loop.
3. LLM + Memory: Claude or GPT for thinking, Pinecone or pgvector for long-term memory. Without a vector DB, the agent forgets everything between sessions.
4. Tools: Give the agent at least 2 distinct tools. Without that, it's a chatbot, not an agent. Examples: web search, code execution, Gmail, a custom API.
5. Output: Decide where the result goes: a webhook out, an email to a customer, a DB record. If it just sits on the screen, the agent isn't integrated.
6. Infrastructure: Queue (Redis/BullMQ), logging (LangSmith), deployment (Docker/Modal). Without this trio, the agent works for a week and then collapses.
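To show what the queue, logging, and error-handling pieces buy you, here is a minimal sketch using only the Python standard library. It is a stand-in for the shape of the pattern, not a substitute for Redis, BullMQ, Celery, or a tracing tool. The `handle` function is a hypothetical agent entry point.

```python
import logging
import queue
import threading

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

jobs = queue.Queue()        # in-process stand-in for a real job queue
results: list[str] = []

def handle(task: str) -> str:
    """Hypothetical agent entry point; raises on bad input."""
    if not task.strip():
        raise ValueError("empty task")
    return f"handled: {task}"

def worker() -> None:
    while True:
        task = jobs.get()
        if task is None:            # sentinel: shut the worker down
            jobs.task_done()
            break
        try:
            results.append(handle(task))
            log.info("ok task=%r", task)
        except Exception:
            # A failed job is logged with its traceback and skipped;
            # the worker keeps processing the rest of the queue.
            log.exception("failed task=%r", task)
        finally:
            jobs.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()
for task in ["summarize inbox", "", "file weekly report"]:
    jobs.put(task)                  # returns immediately; work runs off-thread
jobs.put(None)
jobs.join()                         # wait until every queued job is processed
t.join()
```

The empty task fails inside `handle`, gets logged, and the worker still processes the job after it. That is the behavior the trio is meant to guarantee: one bad input does not block the caller or take down the run, and there is a log line to debug from.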