When AI retrieval degrades, check if notes are self-contained — fragments confuse AI
When AI retrieval quality degrades despite good source material, diagnose whether notes are self-contained units or fragments requiring external context, because fragmentation produces context confusion that corrupts AI reasoning.
Why This Is a Rule
AI retrieval systems (RAG, semantic search, copilots) work by finding relevant notes and feeding them to the language model as context. When a note is self-contained — it includes enough context to be understood on its own — the AI can reason about it accurately. When a note is a fragment — it requires other notes, headings, or implicit context to make sense — the AI receives a piece without the puzzle, and its reasoning goes wrong in subtle, confident ways.
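The mechanism can be sketched in a few lines. This is a toy illustration, not how any particular RAG product works: word overlap stands in for embedding similarity, the helper names (`tokenize`, `retrieve`, `build_prompt`) are hypothetical, and the two sample notes are invented for the example. What it shows is that each note is scored and delivered on its own, so whatever context lives in a neighboring note never reaches the model.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, notes: list[str], k: int = 1) -> list[str]:
    """Rank notes by word overlap with the query (a stand-in for
    embedding similarity). Each note is scored independently."""
    query_words = tokenize(query)
    ranked = sorted(
        notes,
        key=lambda note: len(query_words & tokenize(note)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, retrieved: list[str]) -> str:
    """Assemble the context the language model would actually see."""
    context = "\n".join(f"- {note}" for note in retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}"


# Invented sample notes: the first states the problem, the second is a fragment.
notes = [
    "Circular imports between the billing and invoicing modules break startup.",
    "The third approach solves this by inverting the dependency.",
]

query = "Which approach works by inverting the dependency?"
retrieved = retrieve(query, notes)
prompt = build_prompt(query, retrieved)
print(prompt)
```

In this sketch the fragment wins the ranking because it shares the most words with the query, and the note that defines the problem is left out of the prompt. The model is then asked to reason about "this" and "the third approach" with nothing to resolve them against.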
The most common fragmentation pattern: notes that make sense within their parent document but not in isolation. "The third approach solves this by inverting the dependency" — "this" refers to a problem stated three notes ago, "third approach" implies two others that aren't present, and "the dependency" is defined elsewhere. A human reading the full document follows the references. An AI receiving this note in isolation confabulates the referents.
When retrieval quality degrades despite good source material, fragmentation is the most common cause. The material is good — but the individual retrieval units are not self-contained.
When This Fires
- AI retrieval from your knowledge base produces confidently wrong answers
- Semantic search returns relevant notes but the AI misinterprets them
- You've invested in good content but AI copilot quality is inconsistent
- Any time AI reasoning about your notes feels "off" despite the notes being accurate
Common Failure Mode
Blaming the AI model or the embedding quality when the problem is note structure. You switch to a better model or re-embed with a different algorithm, but quality doesn't improve, because the fragments are still fragments. The fix is structural: make each note self-contained rather than tuning the retrieval algorithm that operates on broken units.
The Protocol
When AI retrieval quality degrades:
1. Pull 5-10 notes that the AI recently retrieved and misinterpreted.
2. Read each note in isolation, without seeing surrounding notes.
3. For each, ask: "Can I understand this note without any external context?"
4. Notes with dangling references ("this," "the above," "as mentioned") or implicit context are fragments.
5. Rewrite fragments to be self-contained: replace pronouns with specific nouns, include the key context inline, and ensure each note can be understood by a reader (or AI) encountering it for the first time.
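The dangling-reference check in the protocol can be partly automated as a first pass. The sketch below is a rough heuristic, not a replacement for reading each note in isolation: the phrase list starts from the examples above ("this," "the above," "as mentioned") and adds ordinal phrases such as "the third" as an assumption, and the function names are hypothetical. It will produce false positives (for example "this note") and cannot detect implicit context at all, so treat its output as a shortlist for manual review.

```python
import re

# Phrases that often point at context outside the note. The first three come
# from the protocol; the ordinal and former/latter phrases are an added assumption.
DANGLING_REFERENCE = re.compile(
    r"\b(this|these|the above|as mentioned|"
    r"the (?:first|second|third|former|latter))\b",
    re.IGNORECASE,
)


def find_dangling_references(note: str) -> list[str]:
    """Return the suspicious phrases found in a note, lowercased, in order."""
    return [match.group(0).lower() for match in DANGLING_REFERENCE.finditer(note)]


def audit(notes: list[str]) -> dict[int, list[str]]:
    """Map the index of each flagged note to the phrases that flagged it."""
    flagged = {}
    for index, note in enumerate(notes):
        hits = find_dangling_references(note)
        if hits:
            flagged[index] = hits
    return flagged


# Invented sample notes: one fragment and one self-contained rewrite.
sample_notes = [
    "The third approach solves this by inverting the dependency.",
    "Dependency inversion removes the circular import between billing and "
    "invoicing by having both modules depend on a shared interface.",
]

report = audit(sample_notes)
for index, hits in report.items():
    print(f"note {index}: possible dangling references {hits}")
```

Run against the 5-10 misinterpreted notes, a report like this shows where to start rewriting. A note that comes back clean may still depend on implicit context, so the isolation read remains the deciding test.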