The difference that matters
A concrete example. Consider a sales operations workflow: an incoming RFP arrives by email, is logged in the CRM, and needs a response drafted, which the sales lead reviews before it goes back to the client.
Rule-based automation
Can do: detect RFP email, log it to CRM with metadata, notify sales team, track response deadline. Cannot do: actually draft the response in your voice with relevant case studies and appropriate technical detail.
Agentic workflow
Can do everything rule-based automation does, plus: read the RFP, identify what's actually being asked, pull relevant past case examples from your knowledge base, draft a response in your house style, flag any ambiguities for human review, present the complete package for sales lead approval. The human reviews and sends; the agent has done roughly 70–80% of the work.
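The split can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `Email` and `RfpRecord` types, the 14-day default deadline, and the `draft_with_agent` stub are all assumptions made for the example. The first two functions are plain rules; the third only marks where a model-backed drafting step would sit.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional
import re


@dataclass
class Email:
    sender: str
    subject: str
    body: str
    received: date


@dataclass
class RfpRecord:
    client: str
    subject: str
    deadline: date
    draft: Optional[str] = None
    flags: List[str] = field(default_factory=list)
    approved: bool = False  # only the sales lead sets this to True


def is_rfp(email: Email) -> bool:
    # Rule-based detection: a fixed keyword match on the subject line.
    pattern = r"\b(rfp|request for proposal)\b"
    return re.search(pattern, email.subject, re.IGNORECASE) is not None


def log_rfp(email: Email, response_days: int = 14) -> RfpRecord:
    # Rule-based logging: copy metadata and compute a response deadline.
    deadline = email.received + timedelta(days=response_days)
    return RfpRecord(client=email.sender, subject=email.subject, deadline=deadline)


def draft_with_agent(record: RfpRecord, email: Email) -> RfpRecord:
    # Hypothetical stub for the judgment step. A real agent would read the RFP,
    # retrieve case studies and write in house style; this only shows the
    # shape of what it hands back: a draft plus flags for the reviewer.
    record.draft = f"[draft response to: {record.subject}]"
    if "?" not in email.body:
        record.flags.append("no explicit questions found; confirm scope with client")
    return record
```

Note that the record leaves the agent step unapproved: sending stays with the sales lead.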
Where agents add value, and where they don't
Strong fit
- High-volume routine work with some judgment needed (lead screening, document review, support triage)
- Multi-step processes spanning several systems where context needs to be carried through (sales cycle management, project intake processing)
- Knowledge work bottlenecks where senior team time is the constraint (proposal drafting, executive summary generation, contract preliminary review)
- Content production pipelines with quality standards but predictable structure (industry research summaries, financial reporting, social content drafts)
Weak fit
- Genuinely strategic decisions requiring market judgment, competitive read, or stakeholder management
- High-stakes regulated activities where the cost of error massively exceeds the value of speed
- Creative work where the brief is "make something good" without clear quality criteria
- Highly variable processes where every instance is materially different and patterns are weak
Architecture patterns
Single-agent workflows
One agent handles a defined scope: read this, decide that, take this action. Simpler to deploy, easier to audit. Works for clearly scoped tasks like lead qualification or content categorisation.
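One way to keep a single agent's scope auditable is to wrap its decision in a fixed set of permitted actions. The sketch below assumes a hypothetical `decide` callable standing in for the model call, and illustrative action names; anything the agent proposes outside the allowed set is downgraded to an escalation and logged.

```python
from dataclasses import dataclass
from typing import Callable, List

# Illustrative scope for a lead-qualification agent (assumed names).
ALLOWED_ACTIONS = {"qualify", "disqualify", "escalate"}


@dataclass
class Lead:
    company: str
    employees: int
    budget_confirmed: bool


def run_agent(lead: Lead, decide: Callable[[Lead], str], audit_log: List[dict]) -> str:
    """Ask the agent for an action, enforce the scope, and record both."""
    proposed = decide(lead)
    # Out-of-scope proposals never execute; they go to a human instead.
    action = proposed if proposed in ALLOWED_ACTIONS else "escalate"
    audit_log.append({"company": lead.company, "proposed": proposed, "action": action})
    return action
```

Because every proposal and every executed action lands in the same log, a reviewer can see exactly how often the agent tried to step outside its scope.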
Multi-agent orchestration
Multiple specialised agents coordinated by an orchestrator. One agent handles initial intake, another handles research, another handles drafting, another handles review. Works for complex multi-step processes where different stages need different specialised capabilities.
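A minimal sketch of the orchestration pattern, assuming each specialised agent can be modelled as a function that takes a shared context and returns an enriched one. The four stage functions here are hypothetical stand-ins; in a real build each would wrap its own model, prompt and tools.

```python
from typing import Callable, Dict, List

Context = Dict[str, object]
Stage = Callable[[Context], Context]


def intake(ctx: Context) -> Context:
    return {**ctx, "request": str(ctx["raw"]).strip()}


def research(ctx: Context) -> Context:
    return {**ctx, "sources": [f"notes on {ctx['request']}"]}


def draft(ctx: Context) -> Context:
    n = len(ctx["sources"])
    return {**ctx, "draft": f"Response to {ctx['request']} using {n} source(s)"}


def review(ctx: Context) -> Context:
    return {**ctx, "approved": len(str(ctx["draft"])) > 0}


def orchestrate(raw: str, stages: List[Stage]) -> Context:
    """Run stages in order, carrying context forward and recording a trace."""
    ctx: Context = {"raw": raw, "trace": []}
    for stage in stages:
        ctx = stage(ctx)
        ctx["trace"].append(stage.__name__)  # which agent ran, in what order
    return ctx


result = orchestrate("  pricing question  ", [intake, research, draft, review])
```

The orchestrator owns ordering and context; each agent only needs to know its own inputs and outputs, which keeps stages replaceable.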
Human-in-the-loop workflows
Agents handle steps autonomously up to a checkpoint, where human review is required before proceeding. Checkpoint placement is the central design decision: where do you want explicit human judgment, and where do you trust the agent? Higher-stakes workflows have more checkpoints; high-velocity routine workflows have fewer.
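Checkpoint placement can be made explicit in the workflow definition itself. The sketch below is one possible shape, with assumed names throughout: each `Step` carries a `checkpoint` flag, and `approve` is a hypothetical callback representing the human reviewer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    run: Callable[[dict], dict]
    checkpoint: bool = False  # True: a human must approve this step's output


def run_workflow(steps: List[Step], state: dict,
                 approve: Callable[[str, dict], bool]) -> dict:
    """Run steps in order; halt at the first checkpoint the reviewer rejects."""
    for step in steps:
        state = step.run(state)
        if step.checkpoint and not approve(step.name, state):
            state["halted_at"] = step.name
            return state
    state["halted_at"] = None
    return state


steps = [
    Step("classify", lambda s: {**s, "category": "rfp"}),
    Step("draft", lambda s: {**s, "draft": "..."}, checkpoint=True),
    Step("send", lambda s: {**s, "sent": True}),
]
```

Moving a checkpoint is then a one-line change to the step list rather than a change to the agent.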
Asynchronous agent processes
Long-running processes where the agent works over hours or days, periodically updating status. Useful for research tasks, monitoring tasks, or pipeline processes where immediate response isn't needed.
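As a rough sketch of the pattern, using Python's standard `asyncio`: the agent runs as a background task and writes progress into a shared status record that other code can poll. The `research_task` function, the status fields and the short sleeps (standing in for hours of real work) are assumptions for illustration.

```python
import asyncio
from typing import Dict, List


async def research_task(topics: List[str], status: Dict[str, object]) -> None:
    status["state"] = "running"
    for i, topic in enumerate(topics, start=1):
        await asyncio.sleep(0.01)  # stands in for a long-running unit of work
        status["done"] = i
        status["last_topic"] = topic
    status["state"] = "finished"


async def main(topics: List[str]) -> List[dict]:
    status: Dict[str, object] = {"state": "queued", "done": 0, "total": len(topics)}
    snapshots = []
    task = asyncio.create_task(research_task(topics, status))
    while not task.done():
        snapshots.append(dict(status))  # periodic status update
        await asyncio.sleep(0.005)
    await task  # re-raises any exception from the agent task
    snapshots.append(dict(status))
    return snapshots
```

In production the status record would more likely live in a database or queue than in memory, so the process survives restarts.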
Tooling and integration
Workflow orchestration
Common tools: n8n (open-source, self-hostable), Zapier (managed, easy), Make (managed, more powerful), or custom orchestration in Python or TypeScript for complex needs. Selection depends on complexity, hosting preferences, and team skills.
Agent frameworks
For the AI reasoning layer: LangChain or LangGraph (open-source agent frameworks), MCP (Model Context Protocol) for connecting agents to data sources and tools, or custom builds where requirements are unusual.
Data and tool connections
Agents need access to source systems (CRM, email, documents, knowledge bases) and target systems (where actions get taken). Connections via APIs, MCP servers, or direct database access depending on system architecture.
Monitoring and observability
Production agents need observability: what decisions are they making, what's working, what's failing, where are humans intervening most. Tools like LangSmith, Helicone, or custom logging infrastructure depending on scale.
How a typical engagement runs
Discovery (free, 30 minutes)
We map your current process, identify candidate workflows, and assess data and integration readiness. Honest assessment of whether agentic workflows make sense for your specific situation.
Workflow design (week 1)
For the highest-priority workflow, detailed design: process map, agent scope definition, checkpoint placement, success metrics. Written design document agreed with the client before any building starts.
Pilot build and deployment (weeks 2–6)
The initial agent is built and deployed in shadow mode: its decisions are recorded alongside the current process, but it takes no action. This allows comparison against human decisions, and tuning, before live deployment.
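The shadow-mode comparison can be as simple as pairing each human decision with what the agent would have done. A minimal sketch, assuming decisions are recorded as `(case_id, human_decision, agent_decision)` tuples; the function name and report fields are illustrative.

```python
from collections import Counter
from typing import Iterable, Tuple


def shadow_report(cases: Iterable[Tuple[str, str, str]]) -> dict:
    """Summarise how often the shadow agent matched the human decision."""
    cases = list(cases)
    disagreements = [(cid, h, a) for cid, h, a in cases if h != a]
    agreement = (len(cases) - len(disagreements)) / len(cases) if cases else 0.0
    # Count which (human, agent) pairs disagree most, to guide tuning.
    confusion = Counter((h, a) for _, h, a in disagreements)
    return {
        "n": len(cases),
        "agreement": agreement,
        "disagreements": disagreements,
        "confusion": confusion,
    }


report = shadow_report([
    ("c1", "qualify", "qualify"),
    ("c2", "reject", "reject"),
    ("c3", "qualify", "reject"),
    ("c4", "escalate", "escalate"),
])
```

The disagreement list is usually more useful than the headline rate: each entry is a concrete case to review before going live.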
Production deployment (weeks 7–8)
Agent moves to live execution with appropriate human-in-the-loop checkpoints. Performance monitoring, weekly reviews initially, monthly thereafter.
Expansion (ongoing)
Once one workflow works reliably, additional workflows are added incrementally. The discipline: prove value at each step before expanding scope. Big-bang automation programmes that try to do everything at once consistently underperform incremental expansion.
Security and PDPL alignment
Agents accessing business data require careful architecture:
- Local LLMs for sensitive workflows (legal, financial, healthcare) where data shouldn't leave client infrastructure
- Zero-retention configurations when public APIs are used (OpenAI enterprise, Anthropic enterprise, Google Vertex AI)
- PII masking before any external API call
- Audit logging on every agent decision and action
- PDPL right-to-deletion compliance built into data handling
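Two of the controls above, masking and audit logging, can be sketched together. This is illustrative only: the two regular expressions catch simple email and phone formats and would miss names, addresses and ID numbers, and `send` is a hypothetical stand-in for whatever external API client is in use.

```python
import json
import re
from datetime import datetime, timezone
from typing import Callable, List

# Deliberately simple patterns; real deployments need broader PII detection.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def mask_pii(text: str) -> str:
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def audit(log: List[str], actor: str, action: str, detail: dict) -> None:
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "detail": detail,
    }
    log.append(json.dumps(entry))  # one JSON line per decision or action


def call_external_model(prompt: str, log: List[str],
                        send: Callable[[str], str]) -> str:
    masked = mask_pii(prompt)  # mask before anything leaves the boundary
    audit(log, "agent", "external_call", {"chars": len(masked)})
    return send(masked)
```

The audit entry records that a call happened and how large it was, not the prompt itself, so the log does not become a second copy of the sensitive data.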
Frequently asked questions
What's the difference between agentic AI and regular automation?
Regular automation (Zapier, Make, traditional workflows) follows fixed rules: when X happens, do Y. Agentic AI handles steps that require judgment: review this document and decide what to do, draft a response in context, escalate ambiguous cases. The difference: automation executes; agents reason. Most production deployments combine both — automation for the rule-based steps, agents for the judgment steps.
What kinds of workflows are good candidates?
Three patterns work well: (1) high-volume routine work that requires some judgment (contract review checklists, lead screening, customer support triage); (2) multi-step processes where information must be gathered and decided across several systems (sales-cycle reporting, financial reconciliation, content publishing pipelines); (3) tasks currently bottlenecked by senior team time (proposal generation, report drafting, executive summaries). Tasks requiring genuine creativity or strategic judgment usually aren't good candidates.
Where does human-in-the-loop matter most?
Wherever the cost of error exceeds the cost of human review. Legal-adjacent work (contract terms, compliance assessments) typically requires human checkpoints, as do financial-impact decisions above a defined threshold, anything client-facing where brand voice matters, and anything regulatory. The discipline: design the workflow with explicit checkpoint placement before deployment, not after errors happen.
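One way to make that rule concrete is to compare the expected cost of an error with the cost of a review. This framing is an assumption added for illustration, and the figures below are invented: a step gets a checkpoint when its error probability multiplied by the cost of an error exceeds what a human review costs.

```python
def needs_checkpoint(error_probability: float, error_cost: float,
                     review_cost: float) -> bool:
    """True when the expected cost of an error exceeds the cost of review."""
    return error_probability * error_cost > review_cost


# Hypothetical steps: (name, error probability, cost of an error, cost of review)
steps = [
    ("contract clause summary", 0.02, 50_000, 40),
    ("internal meeting notes", 0.05, 100, 40),
]
checkpoints = [name for name, p, cost, review in steps
               if needs_checkpoint(p, cost, review)]
```

In practice the probabilities come from shadow-mode or pilot data rather than guesswork, and regulatory steps get a checkpoint regardless of the arithmetic.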
Can agents work across multiple systems?
Yes — this is often where they add the most value. Agents can read from one system (CRM, email, document store), reason about the contents, and take action in another system (creating tasks, drafting responses, updating records). The technical layer: API integrations, MCP (Model Context Protocol) servers, and orchestration frameworks. The strategic layer: clear scope definition for what the agent can and cannot do.
What's the realistic deployment timeline?
Simple single-task agents: 2–4 weeks. Multi-step workflows spanning 2–3 systems: 6–10 weeks. Complex orchestrations involving 5+ systems and multiple decision points: 12–20 weeks. We always start with the simplest version that delivers value, prove it works, then expand scope. Big-bang deployments that try to automate everything at once tend to fail.
What about errors and exceptions?
Three layers of error handling: (1) automatic retry for transient failures; (2) escalation to humans for ambiguous cases or low-confidence decisions; (3) audit logging so errors get reviewed and the system improves. The goal isn't zero errors — that's not achievable. The goal is: errors caught early, surfaced clearly, and used to improve the system over time.
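The three layers can be sketched as follows. The `TransientError` type, the 0.8 confidence threshold and the `classify` callable (standing in for the agent's decision, returning a label and a confidence score) are assumptions for the example.

```python
import time
from typing import Callable, List, Tuple


class TransientError(Exception):
    """Raised for failures worth retrying, such as timeouts or rate limits."""


def with_retry(fn: Callable[[], Tuple[str, float]], attempts: int = 3,
               base_delay: float = 0.01) -> Tuple[str, float]:
    # Layer 1: automatic retry with exponential backoff.
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("attempts must be at least 1")


def handle(item: str, classify: Callable[[str], Tuple[str, float]],
           audit_log: List[dict], min_confidence: float = 0.8) -> str:
    try:
        label, confidence = with_retry(lambda: classify(item))
    except TransientError as exc:
        # Layer 2: escalate when retries are exhausted.
        audit_log.append({"item": item, "outcome": "escalated",
                          "reason": f"retries exhausted: {exc}"})
        return "escalated"
    if confidence < min_confidence:
        # Layer 2: escalate low-confidence decisions to a human.
        audit_log.append({"item": item, "outcome": "escalated",
                          "reason": "low confidence", "confidence": confidence})
        return "escalated"
    # Layer 3: every outcome is logged so errors can be reviewed later.
    audit_log.append({"item": item, "outcome": label, "confidence": confidence})
    return label
```

Every path through `handle` writes to the audit log, including the successful ones, which is what makes later review and tuning possible.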