The symptom everyone recognizes
AI governance councils, responsible-AI committees, and risk reviewers spend hours on prompt wording, red-team transcripts, and output samples. The policy PDFs are in the same room, but the artifact that actually changes before launch is almost always the prompt or the agent instruction block.
That is not because teams prefer prompts. It is because prompts are the only governance surface the runtime exposes.
Why policies do not enter the review loop
- Policy is prose; the agent consumes tokens, not approval matrices.
- No tooling shows whether a policy clause compiled into a checkable constraint.
- Updates to policy do not automatically invalidate or re-verify agent configurations.
- Auditors ask for traceability; the team can show prompt versions, not governance lineage.
Reviewing prompts is, in effect, Governance Compilation done by hand in meeting rooms: people translate policy clauses into agent instructions one at a time. The result is slow, non-portable, and reset on every model change.
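Two of the gaps listed above, missing clause coverage and missing invalidation on policy updates, can be made concrete with a small sketch. Everything here is a hypothetical illustration rather than an existing tool: `AgentConfig`, `review_status`, the clause IDs, and the sample refund policy text are all assumed for the example.

```python
import hashlib
from dataclasses import dataclass


def policy_fingerprint(policy_text: str) -> str:
    """Hash the approved policy text so any edit yields a new fingerprint."""
    return hashlib.sha256(policy_text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical agent configuration record (not from any real framework)."""
    name: str
    compiled_from: str          # fingerprint of the policy it was verified against
    covered_clauses: frozenset  # clause IDs that have a checkable constraint


def review_status(config: AgentConfig, policy_text: str, clause_ids: set) -> dict:
    """Report whether the config is stale and which clauses lack a constraint."""
    stale = config.compiled_from != policy_fingerprint(policy_text)
    uncovered = sorted(clause_ids - config.covered_clauses)
    return {"stale": stale, "uncovered_clauses": uncovered}


# Made-up policy text: version 2 lowers the approval threshold.
policy_v1 = "4.2 Refunds above 500 USD require manager approval."
policy_v2 = "4.2 Refunds above 250 USD require manager approval."

config = AgentConfig(
    name="refund-agent",
    compiled_from=policy_fingerprint(policy_v1),
    covered_clauses=frozenset({"4.2"}),
)

status_v1 = review_status(config, policy_v1, {"4.2", "4.3"})
status_v2 = review_status(config, policy_v2, {"4.2", "4.3"})
print(status_v1)
print(status_v2)
```

Against the original policy text the configuration is current but leaves clause 4.3 without a constraint. Against the edited text it is flagged as stale, which is the signal a prompt-only review never produces.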
What changes when compilation exists
When governance artifacts become first-class objects, review shifts upstream:
- Does the governance contract match the approved policy version?
- Are authority thresholds and evidence gates encoded—not paraphrased?
- Does deployment produce governance lineage auditors can query?
Prompt review remains useful for tone and task framing. It stops being the sole carrier of institutional authority.
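As a minimal sketch of what "encoded, not paraphrased" could look like, the three review questions above map onto a small data structure and one check. The names `GovernanceContract`, `Decision`, and `evaluate`, the version string, the threshold, and the approver ID are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class GovernanceContract:
    """Hypothetical compiled form of one approved policy version."""
    policy_version: str           # approved policy version this contract encodes
    approval_threshold: float     # amounts above this need a human approver
    required_evidence: frozenset  # evidence that must be attached before acting


@dataclass
class Decision:
    allowed: bool
    reasons: list   # why the action was blocked (empty when allowed)
    lineage: dict   # record an auditor could query later


def evaluate(contract, approved_policy_version, amount, evidence, approver=None):
    """Check one proposed action against the contract and record lineage."""
    reasons = []
    # Question 1: does the contract match the approved policy version?
    if contract.policy_version != approved_policy_version:
        reasons.append("contract does not match approved policy version")
    # Question 2: are evidence gates and authority thresholds satisfied?
    missing = sorted(contract.required_evidence - set(evidence))
    if missing:
        reasons.append("missing evidence: " + ", ".join(missing))
    if amount > contract.approval_threshold and approver is None:
        reasons.append("amount exceeds authority threshold without approver")
    # Question 3: produce lineage for every decision, allowed or not.
    lineage = {
        "policy_version": contract.policy_version,
        "amount": amount,
        "evidence": sorted(evidence),
        "approver": approver,
        "evaluated_at": datetime.now(timezone.utc).isoformat(),
    }
    return Decision(allowed=not reasons, reasons=reasons, lineage=lineage)


# Illustrative values only.
contract = GovernanceContract("2024-03", 500.0, frozenset({"receipt"}))
blocked = evaluate(contract, "2024-03", 750.0, {"receipt"})
approved = evaluate(contract, "2024-03", 750.0, {"receipt"}, approver="manager-17")
print(blocked.allowed, blocked.reasons)
print(approved.allowed, approved.lineage["policy_version"])
```

The point of the sketch is that the threshold and the evidence gate are values a reviewer can diff against the policy, and that each decision carries its own lineage record regardless of how the prompt is worded.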
Implication for leaders
If your governance program metrics are prompt-centric (word counts, jailbreak tests, sample outputs), you are measuring how well policy was translated into prompts without asking whether that manual translation was necessary.
The strategic question is not "How do we review prompts better?" It is "What would let us review policies as compiled constraints instead?"