The agent that does everything does nothing reliably
You've designed an agent. It handles your morning routine: wake up, hydrate, stretch, review your calendar, eat a high-protein breakfast, check your task list, and leave for work by 8:15. On paper it looks elegant. One agent, one trigger ("alarm goes off"), seven coordinated actions.
Then Tuesday happens. You wake up late. The stretching doesn't happen because you're rushing. You skip the calendar review to eat. You grab coffee instead of water because you're behind. By Wednesday the whole agent is dead. Not because any single behavior was hard, but because you wired seven independent concerns into one monolithic chain, and the first disruption brought down the entire system.
This is the most common architectural mistake in personal agent design. And it mirrors a mistake that software engineers, systems theorists, and cognitive scientists have been identifying for over sixty years.
The Unix lesson: do one thing and do it well
In 1978, Doug McIlroy — the inventor of Unix pipes — articulated the philosophy that would define a generation of reliable software: "Write programs that do one thing and do it well. Write programs to work together." This wasn't aesthetic preference. It was an engineering principle born from watching complex programs fail (McIlroy, 1978).
The Unix operating system became the most influential software architecture in history not because its individual tools were powerful, but because each tool was narrow. grep searches text. sort sorts lines. wc counts words. None of them tries to do the others' jobs. When you need to find the ten most common words in a file, you compose them: pipe the output of one into the input of the next. Each tool fires reliably because each tool handles exactly one concern.
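The word-count composition described above is concrete enough to write down. A sketch using only classic Unix tools (tr and uniq join the ones already named); the sample sentence is invented for illustration:

```shell
# Find the ten most common words by composing narrow tools.
# Each stage handles exactly one concern.
printf 'the cat sat on the mat the end\n' |
  tr -s '[:space:]' '\n' |  # split input into one word per line
  sort |                    # group identical words together
  uniq -c |                 # count each group
  sort -rn |                # rank by count, highest first
  head -10                  # keep the top ten
```

No single stage knows about the overall goal; the breadth of the pipeline comes entirely from composition.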
Peter Salus summarized the philosophy in A Quarter-Century of Unix (1994): write programs that do one thing well, write programs to work together, write programs to handle text streams because that is a universal interface. The third point matters — when each tool has a standard interface, composition is trivial. When each tool tries to be a Swiss Army knife, composition is impossible.
Robert C. Martin formalized the software equivalent as the Single Responsibility Principle: every module should have one, and only one, reason to change. When a module handles two concerns, a change to either concern can break the other. The failure rate doesn't add — it multiplies, because every concern interacts with every other concern in ways you didn't anticipate.
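Martin's principle is easiest to see in code. A minimal hypothetical sketch (the class names are invented for illustration): first a module with two reasons to change, then the same concerns split so each module has exactly one.

```python
# Two concerns fused: a change to formatting or a change to storage
# can each break callers of the other.
class Report:
    def __init__(self, lines):
        self.lines = lines

    def format(self):          # reason to change #1: presentation
        return "\n".join(self.lines)

    def save(self, path):      # reason to change #2: storage
        with open(path, "w") as f:
            f.write(self.format())


# Split: each class now has one reason to change, and the two
# compose at the call site instead of inside one module.
class ReportFormatter:
    def format(self, lines):
        return "\n".join(lines)


class ReportWriter:
    def save(self, text, path):
        with open(path, "w") as f:
            f.write(text)
```

The split version mirrors the Unix pattern: the formatter's output becomes the writer's input, and neither needs to know the other exists.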
Your personal agents follow the same physics. An agent that handles both "I'm dehydrated" and "I haven't exercised today" has two trigger conditions that can conflict. When you're dehydrated and haven't exercised, which fires first? What if the action for one (drink 500ml of water) interferes with the action for the other (go for a run)? A narrow agent never faces this ambiguity. The dehydration agent fires when you're dehydrated. The exercise agent fires when you haven't moved. They don't know about each other, and they don't need to.
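The decoupling described above can be sketched directly. In this hypothetical model, an agent is nothing more than one trigger predicate paired with one action; the state fields and thresholds are invented for illustration:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    trigger: Callable[[dict], bool]  # fires on exactly one condition
    action: str                      # produces exactly one output

hydration = Agent(
    name="hydration",
    trigger=lambda state: state["ml_water_today"] < 500,
    action="drink a glass of water",
)
exercise = Agent(
    name="exercise",
    trigger=lambda state: not state["moved_today"],
    action="go for a run",
)

def fire(agents, state):
    # Each agent is evaluated independently: no shared activation
    # point, no ordering dependency, no knowledge of the others.
    return [a.action for a in agents if a.trigger(state)]
```

When both conditions hold, both agents fire; neither had to be told about the other, and deleting one changes nothing for the rest.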
Complex systems survive through modularity
Herbert Simon's landmark paper "The Architecture of Complexity" (1962) introduced a concept he called near-decomposability: in any complex system that survives over time, elements interact primarily within their own subsystem and only weakly between subsystems. Simon demonstrated that near-decomposable systems evolve faster than monolithic ones because each subsystem can adapt independently. When one module fails, the failure stays local. When one module improves, the improvement doesn't require rewriting everything else.
Simon's insight applies far beyond software. Biological organisms are modular — your liver doesn't need to coordinate with your immune system to filter toxins, even though both systems operate in the same body. Ecosystems are modular — a change in the insect population of a forest doesn't require renegotiating the relationship between tree roots and soil fungi. The modularity isn't a design choice. It's a survival condition. Systems that lack it don't persist long enough to be studied.
Jerry Fodor extended this principle to the mind itself. In The Modularity of Mind (1983), he argued that cognitive input systems — vision, language processing, face recognition — are domain-specific modules. Each one operates on a specific class of information. Each one fires obligatorily when its input is present. Each one is fast precisely because it is informationally encapsulated: it doesn't need to consult your entire knowledge base to do its job. A face-recognition module doesn't check your beliefs about politics before identifying your friend in a crowd.
Fodor's modules succeed for the same reason Unix tools succeed: they are narrow. They handle one type of input and produce one type of output. The breadth of the system comes from composition, not from making any individual module broader.
Why multi-purpose agents break
When you build a multi-purpose agent, you create three specific failure modes:
Trigger ambiguity. A narrow agent has one trigger condition. When the condition is met, it fires. A multi-purpose agent has multiple trigger conditions that share a single activation point. The result is ambiguity: the agent activates, but you're not sure which sub-behavior is relevant right now. "Morning routine agent" activates when your alarm rings — but your actual situation (overslept, sick, traveling, weekend) determines which sub-behaviors apply. You spend cognitive energy figuring out which parts of the agent to run, which defeats the purpose of having an agent at all.
Cascading failure. When sub-behaviors are coupled into a single agent, failure in one propagates to the rest. You skip stretching, so you feel behind schedule, so you rush breakfast, so you skip the calendar review. A narrow agent that handles only stretching can fail without affecting your hydration agent or your calendar-review agent. Decoupled agents degrade gracefully. Coupled agents collapse entirely.
Resistance to modification. You want to change when you exercise — move it from morning to evening. If exercise is its own agent, you change the trigger time and nothing else is affected. If exercise is step four of a seven-step morning agent, changing it means restructuring the entire sequence, retesting the timing, and hoping nothing else breaks. Multi-purpose agents resist change because every modification has unpredictable interactions with every other component.
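The cascading-failure mode above can be made concrete. In this hypothetical sketch, a monolithic routine aborts at the first failed step, while decoupled agents each succeed or fail on their own terms:

```python
# Which behaviors actually succeeded this (invented) morning.
steps = ["hydrate", "stretch", "calendar", "breakfast"]
succeeded = {"hydrate": True, "stretch": False,
             "calendar": True, "breakfast": True}

def monolithic(steps, succeeded):
    # Coupled chain: one failure kills everything downstream.
    done = []
    for step in steps:
        if not succeeded[step]:
            return done          # the whole agent dies here
        done.append(step)
    return done

def decoupled(steps, succeeded):
    # Independent agents: each fires or fails alone.
    return [s for s in steps if succeeded[s]]
```

One missed stretch costs the monolithic routine three behaviors; the decoupled version loses exactly one.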
These aren't hypothetical. They're the same failure modes that software engineers encounter with monolithic architectures, that organizational theorists encounter with centralized bureaucracies, and that you encounter every time a "comprehensive morning routine" survives exactly four days.
The AI parallel: narrow tools that actually work
The artificial intelligence field has spent the last decade learning this lesson at industrial scale. Every narrow AI system that dominates its domain — DeepMind's AlphaFold for protein structure prediction, GitHub Copilot for code completion, DeepL for translation — excels precisely because it handles one specific task. These systems don't try to be general. They are scoped to a single input type, a single output type, and a single evaluation criterion.
Meanwhile, the push toward artificial general intelligence has revealed a consistent pattern: models that try to be good at everything are mediocre at most things. As McKinsey noted in their 2025 analysis, "the next wave of AI value will come from domain-specific and industry-focused solutions that deeply integrate with business processes." Specialized AI agents outperform general models because they process information faster and with fewer errors within their focused domain (Averi, 2025).
The most striking development in AI engineering during 2025 wasn't bigger models — it was the agentic tool-use paradigm. Instead of building one massive model that handles every task internally, the industry moved toward orchestrating multiple specialized tools through protocols like Anthropic's Model Context Protocol (MCP). An AI agent doesn't try to search the web, write code, query a database, and generate images all with the same mechanism. It calls a specialized search tool for search, a code interpreter for code, a database connector for queries, and an image generator for images. Each tool does one thing well. The agent composes them.
This is Doug McIlroy's Unix philosophy, applied nearly fifty years later at a different scale. The principle didn't change. The systems that work are narrow. The systems that try to do everything are fragile.
Narrow means specific, not simple
There's a misunderstanding worth preempting. Narrow doesn't mean trivial. A narrow agent can handle a complex situation — it just handles one complex situation. An agent for "responding when a direct report misses a deadline" might involve nuanced judgment about the person's track record, the deadline's importance, and your relationship with them. That's sophisticated behavior. But it's scoped to one trigger (missed deadline from a direct report) and one domain (management communication). It doesn't also try to handle your budgeting, your own missed deadlines, or your relationship with your manager.
Specificity is what makes sophistication possible. When an agent has a narrow scope, you can invest in making its behavior excellent within that scope. You can refine the trigger condition until it's precise. You can tune the action until it's effective. You can test it against edge cases within its domain. None of this is possible when the agent also handles five other concerns, because every refinement to one concern interacts with every other concern.
This is why Fodor's cognitive modules are fast — informational encapsulation means each module operates without consulting the full complexity of your knowledge. And it's why Unix tools are reliable — each tool operates without needing to understand the full complexity of the pipeline it's embedded in.
How to narrow an agent
Start with any agent you've designed (or want to design) and apply three tests:
The trigger test. Does this agent have exactly one trigger condition? If you find yourself writing "when X or when Y" or "when X and also Y," you have two agents wearing one name. Split them.
The action test. Does this agent produce exactly one type of output? If the action is "do A then B then C," ask whether A, B, and C could fire independently. If they can, they should. Each becomes its own agent.
The failure test. If one part of this agent fails, does the rest still make sense? If skipping step two makes step three impossible, the coupling might be necessary. But if skipping step two doesn't affect step four, they shouldn't be in the same agent.
A well-scoped agent reads like a Unix tool description: it does one thing, it has a clear input (the trigger), and it produces a clear output (the action). "When I notice I've been sitting for 90 minutes, I stand and walk for five minutes." One trigger. One action. No ambiguity about when it fires. No cascading failure when it doesn't. No interference with your hydration agent, your focus agent, or your meeting-prep agent.
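The trigger test and the action test can even be mechanized. A toy checker over hypothetical agent specs (the dict shape, with "triggers" and "actions" lists, is invented for illustration):

```python
def narrowness_issues(spec):
    """Apply the trigger test and the action test to an agent spec."""
    issues = []
    if len(spec["triggers"]) != 1:
        issues.append("trigger test failed: one agent per trigger")
    if len(spec["actions"]) != 1:
        issues.append("action test failed: one agent per action")
    return issues

# Passes both tests: one trigger, one action.
sitting = {"name": "sitting-break",
           "triggers": ["sat for 90 minutes"],
           "actions": ["stand and walk for five minutes"]}

# Fails the action test: three actions wearing one name.
morning = {"name": "morning-routine",
           "triggers": ["alarm rings"],
           "actions": ["hydrate", "stretch", "review calendar"]}
```

The sitting-break spec comes back clean; the morning-routine spec is flagged for splitting into one agent per action.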
Composition replaces comprehensiveness
The fear behind multi-purpose agents is understandable: if I decompose my morning into seven separate agents, won't I lose the coherence of the routine? The answer is no — you gain something better. You gain composition.
Seven narrow agents that each fire independently are more reliable than one comprehensive agent, because each can succeed or fail on its own terms. On a rushed morning, maybe five of seven fire. That's five successful behaviors instead of zero (which is what you get when the monolithic agent collapses at step two). On a perfect morning, all seven fire — and the result looks identical to the comprehensive routine, except each component earned its place by firing reliably on its own.
This is Simon's near-decomposability in practice. This is McIlroy's Unix philosophy applied to your cognitive infrastructure. This is the principle that every reliable complex system in nature, in software, and in your own mind already follows: the components that survive are the ones scoped narrowly enough to function independently, composed loosely enough to work together, and specific enough to be tested, refined, and trusted.
Your agents should be narrow. Not because narrow is easy, but because narrow is what actually works.