Scale AI output verification to stakes: skim for brainstorming, spot-check for communications, verify every claim for publication
Adopt a three-tier verification scheme matched to stakes: skim for low-stakes brainstorming, spot-check key claims for medium-stakes communications, and verify every substantive claim for high-stakes published or production content.
Why This Is a Rule
AI systems produce outputs with a distinctive error profile: high coherence, variable accuracy. AI-generated text reads smoothly and confidently regardless of factual accuracy. This means the usual "does it look right?" heuristic fails — incorrect AI output looks as polished as correct output. Verification intensity must therefore be calibrated to the consequences of an undetected error, not to the perceived quality of the output.
- Low stakes (brainstorming, idea exploration): skim for general direction. Errors here are cheap; a wrong idea gets filtered in later stages. Over-verifying brainstorming kills the speed advantage that makes AI useful for ideation.
- Medium stakes (internal communications, drafts, analysis): spot-check key claims and numbers. Errors here can mislead colleagues or distort analysis, but are catchable before they propagate externally. Verify the load-bearing claims; accept the rest at face value.
- High stakes (published content, production code, client deliverables, financial analysis): independently verify every substantive claim. Errors here damage reputation, create liability, or produce real-world harm. The AI's confidence provides zero assurance; only independent verification does.
When This Fires
- Every time you use AI output for any purpose beyond personal notes
- When deciding how much review time to allocate to AI-generated content
- When training others on responsible AI-assisted workflows
- When the cost of a propagated AI error varies by context
Common Failure Mode
Uniform verification across all stakes: either verifying everything exhaustively (destroying AI's speed advantage for low-stakes tasks) or spot-checking everything casually (risking undetected errors in high-stakes outputs). The stakes determine the verification intensity, not the perceived quality of the AI output.
The Protocol
(1) Before using any AI output, classify the stakes:
- Low (brainstorming, exploration, personal notes): errors are costless or quickly caught downstream.
- Medium (internal communications, working drafts, preliminary analysis): errors mislead but are correctable before external impact.
- High (publications, client deliverables, production deployments, financial decisions): errors produce external harm, reputational damage, or legal liability.
(2) Apply the matched verification:
- Low → skim for general direction; accept without deep verification.
- Medium → spot-check the 3-5 most important claims. Verify numbers, key facts, and logical flow; accept supporting details.
- High → independently verify every substantive claim. Check facts against primary sources, validate logic, and test code. The AI output is a draft, not a deliverable.
(3) When in doubt about the stakes classification, treat the output as one level higher than your initial assessment.
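The protocol above can be sketched as a small lookup with an escalation rule. This is a minimal illustrative sketch, not part of the source: the `Stakes` enum, the `VERIFICATION` table, and `required_verification` are hypothetical names introduced here.

```python
from enum import IntEnum


class Stakes(IntEnum):
    """Stakes classification from protocol step (1)."""
    LOW = 1      # brainstorming, exploration, personal notes
    MEDIUM = 2   # internal communications, working drafts, preliminary analysis
    HIGH = 3     # publications, client deliverables, production deployments


# Matched verification intensity from protocol step (2).
VERIFICATION = {
    Stakes.LOW: "skim for general direction; accept without deep verification",
    Stakes.MEDIUM: "spot-check the 3-5 most important claims, numbers, and logic",
    Stakes.HIGH: "independently verify every substantive claim against primary sources",
}


def required_verification(assessed: Stakes, uncertain: bool = False) -> str:
    """Return the verification intensity matched to the assessed stakes.

    When the classification is in doubt, escalate one level, capped at
    HIGH (protocol step 3).
    """
    level = Stakes(min(assessed + 1, Stakes.HIGH)) if uncertain else assessed
    return VERIFICATION[level]
```

For example, an internal draft assessed as medium stakes but with an uncertain classification is escalated, so it receives the high-stakes verification treatment.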