From AI Policy to AI Proof: The Audit Is Coming

By Michael Housch

In twelve months, regulators won't be asking if you have an AI policy. They'll be asking for proof it works. That's the shift I'd plan around right now, and it's not speculative. The trajectory is already visible to anyone watching the examiners.

We're seeing the change in real time. The NIST AI Risk Management Framework, the EU AI Act, and updated FFIEC guidance are moving from frameworks you reference to expectations you have to evidence. If you operate in financial services, healthcare, or any other regulated space, "we're working on it" is going to stop being an acceptable answer.

The Shift From Frameworks to Expectations

For the last few years, an AI program could earn credit simply by gesturing at the right documents: a policy that cited NIST, a slide that mentioned the EU AI Act, a committee that met quarterly. That era is closing. The frameworks themselves are hardening into the questions examiners ask, and the answers they expect are no longer aspirational.

Framework	Where It's Headed	What It Increasingly Expects
NIST AI RMF	Voluntary, but the de facto reference	Demonstrated Govern / Map / Measure / Manage functions, not a claim of adoption
EU AI Act	In force, obligations phasing in	Risk classification, technical documentation, and human oversight for higher-risk systems
FFIEC / US banking	Guidance evolving into exam scope	Model risk discipline extended to AI; third-party and oversight evidence on request

What "Proof" Actually Looks Like

Here's what I think examiners will actually want to see: not the policy, but the evidence behind it. Four things, specifically.

⬡ What They'll Ask You to Produce

An AI inventory. Not a spreadsheet someone made last quarter. A live, auditable record of every model in production and the decisions each one is touching, current enough that you'd hand it over without flinching.
Bias and fairness documentation. Especially for any AI that influences credit, access, or outcomes. Tested, dated, and refreshed, with results you can defend, not a one-time assertion that the model is fair.
Incident response plans specific to AI. Not your existing IR playbook with "AI" added to the title. Built for the failure modes that are actually new: model drift, data poisoning, prompt injection, and agents acting outside their intended scope.
Evidence of human oversight. Particularly for agentic systems making autonomous decisions. Who can intervene, at what point, and under what authority, plus a log that proves the oversight happened, rather than a policy that says it should.
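
To make the first two items concrete, here is a minimal sketch of what a queryable inventory could look like. Everything in it is an assumption for illustration: the `ModelRecord` fields, the sample model names and owners, and the 90-day refresh window are hypothetical, not something any framework prescribes. The point is only that "tested, dated, and refreshed" becomes a question you can answer with a query.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class ModelRecord:
    """One inventory entry. All field names are illustrative assumptions."""
    model_id: str
    owner: str                     # a named, accountable human
    decisions_touched: List[str]   # e.g. ["credit_limit"]
    in_production: bool
    last_fairness_test: Optional[date] = None  # None means never tested


def stale_fairness_tests(inventory, today, max_age_days=90):
    """Return production models whose fairness test is missing or too old.

    The 90-day default is an assumed refresh window, not a regulatory number.
    """
    cutoff = today - timedelta(days=max_age_days)
    return [
        m for m in inventory
        if m.in_production
        and (m.last_fairness_test is None or m.last_fairness_test < cutoff)
    ]


# Hypothetical sample data.
inventory = [
    ModelRecord("credit-scoring-v3", "j.doe", ["credit_limit"], True, date(2025, 1, 10)),
    ModelRecord("chat-triage-v1", "a.smith", ["ticket_routing"], True, None),
    ModelRecord("churn-v2", "j.doe", ["retention_offer"], False, None),
]

overdue = stale_fairness_tests(inventory, today=date(2025, 3, 1))
print([m.model_id for m in overdue])  # prints ['chat-triage-v1']
```

In practice the records would live in whatever system of record you already trust, not in a Python list. What matters is that the answer comes from current data rather than from someone's memory of last quarter.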

The Difference Is Whether It Holds Up

The line between a program that passes and one that doesn't isn't sophistication. It's whether the controls produce evidence on demand. Same policy, two very different audits.

✓ Holds Up Under Audit

An inventory you can query live, not reconstruct from memory
Fairness testing with results, dates, and owners
AI incident runbooks that have actually been exercised
Oversight logged against named, accountable humans
A clear record of who approved what, and when

✗ Falls Apart Under Audit

A policy PDF and a spreadsheet from two quarters ago
"We follow NIST" with nothing to put behind it
A generic IR plan with "AI" bolted onto the cover
Oversight asserted but never evidenced
No provenance for the decisions the model made
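
The "oversight logged" versus "oversight asserted" contrast can be sketched the same way. This is an assumed shape, not a standard: the `OversightEvent` fields, the action names, and the sample decision IDs are all hypothetical. The check it runs is the one I'd expect an examiner to ask for: which autonomous decisions have no named human on record?

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class OversightEvent:
    """One logged human intervention. Field names are illustrative assumptions."""
    decision_id: str
    reviewer: str    # named, accountable human
    authority: str   # the role or mandate the reviewer acted under
    action: str      # e.g. "approved", "overridden", "escalated"
    at: datetime


def decisions_without_oversight(decision_ids, log):
    """Return the decision IDs that have no logged oversight event at all."""
    reviewed = {event.decision_id for event in log}
    return [d for d in decision_ids if d not in reviewed]


# Hypothetical sample data.
log = [
    OversightEvent(
        decision_id="loan-1001",
        reviewer="j.doe",
        authority="credit-risk-officer",
        action="approved",
        at=datetime(2025, 3, 1, 14, 30, tzinfo=timezone.utc),
    ),
]

gaps = decisions_without_oversight(["loan-1001", "loan-1002"], log)
print(gaps)  # prints ['loan-1002']
```

A real log would also need to be append-only and protected from after-the-fact edits, which this sketch does not attempt.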

The Bottom Line

None of this is exotic. It's the unglamorous, operational work of turning a policy into a control that runs and a record you can produce. The organizations building it now will be fine. The ones waiting for the audit letter to start are going to spend that year reconstructing evidence under pressure, which is the worst possible time to build a governance program.

So a question worth sitting with, honestly, before someone external asks it for you: of these four (inventory, fairness, AI-specific incident response, and provable human oversight), which is your organization furthest behind on?

build the proof --before-the-audit-letter