Your agents are running. Are they working?
Throughout Phase 26, you have built a multi-agent cognitive system. You have protocols for communication between agents (L-0507), orchestration patterns (L-0508), context handoffs (L-0509), dependency maps (L-0510), and conflict resolution mechanisms (L-0503). You learned how to add agents carefully (L-0517) and remove them cleanly (L-0518). Each lesson addressed a specific coordination problem.
But here is the question none of those lessons can answer on their own: right now, today, how well is your actual system working?
You may have fifteen cognitive agents operating — habits, routines, delegation structures, automated pipelines, structured practices. Each one was designed with intent. Each one has run long enough to become familiar. And that familiarity is precisely the danger. The moment an agent becomes routine is the moment you stop observing whether it still coordinates with the rest of your system. You stop checking whether outputs reach their intended consumers. You stop noticing that two agents are duplicating effort or that a critical handoff broke three weeks ago and nobody — including you — flagged it.
A coordination review is the mechanism that prevents this drift. It is a periodic, structured assessment of how well your agents work together as a system — not whether individual agents perform, but whether the system they constitute is coherent.
The metacognitive foundation: monitoring your monitors
The coordination review is an exercise in metacognition — the cognitive process of thinking about your own thinking. John Flavell introduced the concept in 1979, distinguishing between metacognitive knowledge (what you know about your own cognitive processes) and metacognitive monitoring (your ability to observe those processes in real time and evaluate their effectiveness). The coordination review targets monitoring specifically: you are not learning new coordination techniques. You are assessing whether the techniques you already deployed are actually functioning.
This distinction matters because most people conflate having a system with having a working system. Flavell's research, and the substantial body of work that followed, demonstrated that metacognitive monitoring is a trainable skill — and that people who engage in structured self-assessment outperform those who rely on intuition about their own performance. A 2020 evidence review by the Education Endowment Foundation confirmed that metacognitive strategies, including periodic self-assessment of learning processes, consistently produce measurable performance gains across domains.
The coordination review applies this principle at the system level. You are not monitoring a single cognitive process. You are monitoring the interactions between multiple processes — which is harder, less intuitive, and more consequential. A single underperforming agent costs you the output of that agent. A broken coordination pattern between agents costs you the output of every downstream process that depended on the handoff.
After-action review: the military prototype
The most rigorously studied coordination review protocol in existence is the After-Action Review (AAR), developed by the U.S. Army in the mid-1970s and formalized through the National Training Center. The AAR asks four questions: What was supposed to happen? What actually happened? Why was there a difference? What will we do differently next time?
The simplicity is deceptive. A 2013 meta-analysis by Scott Tannenbaum and Christopher Cerasoli found that well-conducted debriefs improve team effectiveness by approximately 25% across a wide range of organizations and settings. The key qualifier is "well-conducted." The research consistently shows that unstructured reflection — sitting around and talking about how things went — produces minimal improvement. What produces results is structured comparison between intended and actual outcomes, specific identification of causal factors, and explicit commitment to behavioral change.
The AAR proved consequential during the Gulf War, where the practice spread organically: small groups of soldiers gathered in foxholes and around vehicles to review their most recent missions and identify improvements. It propagated because it worked — not because anyone mandated it. Units that conducted AARs adapted faster, made fewer repeated errors, and coordinated more effectively than units that did not.
Your cognitive agent system is not a military unit. But the structural problem is identical: you have multiple actors (agents) executing plans in a shared environment (your life), and without periodic structured review, coordination failures accumulate silently until the system degrades past the point of easy repair.
The four questions of a coordination review
Translating the AAR structure into a cognitive agent coordination review produces four specific questions. These are not open-ended reflection prompts. They are diagnostic instruments.
Question 1: Coverage. Which agents produced output this review period, and which did not? An agent that was designed to run weekly but has not produced output in three weeks is either broken, unnecessary, or blocked. Each answer implies a different intervention. This question surfaces dormant agents before you forget they were supposed to exist.
Question 2: Handoff integrity. For every agent that produced output, did that output reach its intended consumer? Your reading pipeline is supposed to feed tagged highlights into your writing practice. Did it? Your weekly plan is supposed to constrain your daily task selection. Did today's tasks actually reference this week's plan? Handoff failures are the most common and most invisible coordination breakdowns. They do not generate errors. They generate absence — the downstream agent simply operates without the input it was designed to receive, and you never notice because the downstream agent still runs. It just runs worse.
Question 3: Overhead ratio. How much time did you spend on coordination itself versus the work the agents are supposed to produce? Coordination overhead is a legitimate cost — L-0514 covered this explicitly. But overhead should be proportional to the coordination benefit it enables. If you spend forty-five minutes organizing your task system to produce thirty minutes of focused work, the overhead ratio is inverted. The review should surface cases where coordination mechanisms have become more expensive than the coordination they provide.
Question 4: Coordination failures. Where did agents actively interfere with each other? This is different from handoff failures (missed connections) — this is about conflicts. Your deep-work block and your meeting schedule are in direct contention for the same time resource. Your journaling practice and your reading habit both claim the first hour of the morning. These conflicts may have been resolved in theory (L-0503 covered priority ordering) but the resolution may have broken down in practice. The review surfaces the gap.
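Two of these questions reduce to simple checks you can make mechanical. The sketch below — a minimal illustration, with hypothetical names like `Agent` and `coverage_status`, and an assumed rule that one missed cadence means degraded and two means dormant — shows how the coverage classification (Question 1) and the overhead ratio (Question 3) could be expressed:

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical record for the coverage check (Question 1).
@dataclass
class Agent:
    name: str
    cadence_days: int   # how often the agent is designed to produce output
    last_output: date   # when it last actually produced output

def coverage_status(agent: Agent, today: date) -> str:
    """Classify an agent by elapsed time since its last output,
    measured against its design cadence: one missed cycle is
    degraded, two or more is dormant."""
    elapsed = (today - agent.last_output).days
    if elapsed <= agent.cadence_days:
        return "active"
    if elapsed <= 2 * agent.cadence_days:
        return "degraded"
    return "dormant"

def overhead_ratio(coordination_minutes: float, output_minutes: float) -> float:
    """Question 3: minutes spent on coordination per minute of produced
    work. A ratio above 1.0 means the mechanism is inverted — it costs
    more than the work it enables."""
    return coordination_minutes / output_minutes

today = date(2025, 6, 30)
weekly_review = Agent("weekly review", 7, date(2025, 6, 9))  # silent for three weeks
print(coverage_status(weekly_review, today))   # dormant
print(overhead_ratio(45, 30))                  # 1.5 — inverted, per Question 3
```

The thresholds here are assumptions, not prescriptions; the point is that "active, degraded, or dormant" becomes an honest label only when it is computed from evidence rather than impression.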
System 2 auditing System 1: Kahneman's lens
Daniel Kahneman's dual-process framework from Thinking, Fast and Slow (2011) provides a useful lens for understanding why coordination reviews are necessary and why they resist becoming automatic.
Your cognitive agents, once established, operate primarily through System 1 — fast, automatic, effortful only at setup. Your daily writing habit does not require deliberation once it is established. Your weekly review runs on autopilot. This is by design: the entire purpose of building agents is to delegate cognitive work from deliberate attention to automated execution.
But coordination between agents is a System 2 function. It requires the slow, deliberate, effortful processing that evaluates whether automatic processes are still serving their intended purpose. The coordination review is a scheduled System 2 intervention into a System 1 operational landscape. You are temporarily stepping out of execution mode to evaluate the execution system itself.
Kahneman's research demonstrated that System 2 is lazy — it defaults to accepting System 1's outputs unless something triggers deliberate engagement. This means that without a scheduled review, you will not spontaneously notice coordination failures. System 1 does not flag its own breakdowns. It just keeps running the pattern, even when the pattern no longer serves the system. The review is the forcing function that activates System 2 monitoring on a reliable cadence.
The AI parallel: multi-agent evaluation frameworks
In AI systems engineering, the problem of evaluating multi-agent coordination has become a first-order research concern. As LLM-based multi-agent systems have moved from research prototypes to production deployments, the field has discovered that individual agent performance does not predict system performance. An agent that scores highly on benchmarks in isolation may degrade overall system output when its communication overhead or context consumption interferes with other agents.
The MultiAgentBench framework, published in 2025, introduced milestone-based key performance indicators that measure not just task completion but the quality of collaboration between agents — communication efficiency (how effectively agents exchange information) and decision synchronization (whether agents align their actions to optimize outcomes). The CLEAR framework added cost, latency, efficiency, assurance, and reliability as coordination-specific metrics, recognizing that a multi-agent system is not just a collection of agents but a coordination topology with its own emergent properties.
The research finding most relevant to your personal system is this: architectural choices in multi-agent coordination — how agents are connected, how information flows between them, how conflicts are resolved — produce greater performance differences than the capabilities of individual agents. Studies have shown that coordination topology alone can produce over 100x differences in latency and up to 30% absolute changes in accuracy. The agents matter. How they coordinate matters more.
This is exactly the case for your cognitive agent system. The quality of your individual habits, routines, and practices is important. But the quality of their coordination — the handoffs, the sequencing, the shared context, the conflict resolution — determines whether the system produces coherent output or just generates activity.
A protocol for running the review
Here is a concrete protocol you can execute. It requires thirty minutes and no special tools beyond something to write with.
Step 1: Agent inventory (5 minutes). List every cognitive agent you are currently running. Include habits, routines, recurring processes, delegation structures, automated pipelines, and structured practices. Do not filter for importance — list everything. The review cannot assess agents it does not know about.
Step 2: Status check (5 minutes). For each agent, mark whether it ran as designed during the review period. Active, degraded, or dormant. Be honest. An agent you skipped three times is degraded, not active. An agent you have not run in two weeks is dormant, not "flexible."
Step 3: Handoff audit (10 minutes). For each active agent, trace its outputs. Where does the output go? Did it actually arrive? Pick the three most important handoffs in your system and verify them with evidence, not memory. Check whether this week's plan actually appeared in today's task list. Check whether your reading highlights actually made it into your notes system. Evidence, not impression.
Step 4: Failure identification and repair commitment (10 minutes). Based on steps 1-3, identify the single most consequential coordination failure. Define a specific repair: a concrete change to a handoff, a sequence, a conflict resolution, or an agent design. Write it down. Commit to implementing it before the next review.
This protocol is deliberately minimal. A thirty-minute review that produces one concrete repair every cycle will, over twelve months, address twenty-six to fifty-two coordination failures — depending on whether you run biweekly or weekly. That is the compound effect of structured assessment. Not dramatic transformation, but relentless incremental repair of the connections between your agents.
From individual agents to coherent system
You have spent nineteen lessons learning how to build, manage, and maintain a multi-agent cognitive system. You learned communication protocols, orchestration patterns, conflict resolution, dependency mapping, ecosystem health, and the careful art of adding and removing agents. Each lesson was a component skill.
The coordination review is the mechanism that integrates those skills into a practice. It is the recurring checkpoint that prevents your system from degrading through drift, accumulating coordination debt, or silently breaking the handoffs that connect individual agents into a coherent whole.
Without the review, your agents are a collection. With it, they are a system.
The final lesson of Phase 26 (L-0520) will show you what the system produces when coordination is working: the experience that others perceive as effortless competence. That experience is not talent. It is not luck. It is the observable output of agents that have been repeatedly reviewed, repaired, and refined until they work together without friction. The coordination review is how you get there — one thirty-minute audit at a time.
Sources:
- Flavell, J. H. (1979). "Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry." American Psychologist, 34(10), 906-911.
- Muijs, D., & Bokhove, C. (2020). "Metacognition and Self-Regulation: Evidence Review." Education Endowment Foundation.
- Tannenbaum, S. I., & Cerasoli, C. P. (2013). "Do team and individual debriefs enhance performance? A meta-analysis." Human Factors, 55(1), 231-245.
- Morrison, J. E., & Meliza, L. L. (1999). "Foundations of the After Action Review Process." U.S. Army Research Institute.
- Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.
- Liu, M., et al. (2025). "MultiAgentBench: Evaluating the Collaboration and Competition of LLM Agents." Proceedings of ACL 2025.
- Carver, C. S., & Scheier, M. F. (1998). On the Self-Regulation of Behavior. Cambridge University Press.