What the OMB AI Guidance Doesn't Tell You About Implementation

The federal AI policy framework now contains more text than at any point in its history. M-24-10 set the original architecture in March 2024.¹ M-25-21 reframed it around innovation and public trust in April 2025, rescinding and replacing the prior memo.² M-25-22 set the procurement baseline alongside it.³ The NIST AI Risk Management Framework and its Generative AI Profile sit underneath as the technical reference layer.⁴ Every covered agency has a Chief AI Officer. Every agency publishes an annual use case inventory.⁵ By weight of paper, the regime is now substantial.

By weight of implementation guidance, it is not. The memos define what agencies must do; they do not define how. The translation from policy text to operational practice is being done one program at a time, by agency teams improvising against a framework that does not specify procedure. The patterns emerging at agencies further along the implementation curve are not in the memos. They are in the procurement clauses, the impact-assessment templates, the inventory definitions, and the records-disposition decisions that the policy framework points to but does not contain.

What the framework says, and what implementation actually requires

The OMB guidance is at its strongest when it sets principle. It is at its quietest when it would have to define operational detail.

Chart 01 · Stated vs implementation reality

What the memos say, and what implementation actually requires.

Principle is well-defined. Procedure is not. The translation is done one program at a time.

As stated in policy → What implementation requires

AI Inventory: Agencies shall publish an annual inventory of their AI uses, with required metadata fields.
Gap (granularity definition): Does one underlying model serving four workflows count as one entry or four? The agencies are deciding individually; the inventories already vary in shape.

High-impact designation: Agencies must designate AI uses as high-impact and apply elevated controls accordingly.
Gap (determination procedure): Who decides, on what evidence, with what review gate, against what threshold. Each agency invents the procedure separately.

AI Impact Assessment: Conduct an AI Impact Assessment for designated uses before deployment.
Gap (reconciliation with PIA / FISMA / NARA): How the AIA relates to existing privacy, security authorization, and records-disposition assessments is reconciled program by program.

Procurement controls: Agency AI procurements shall include clauses addressing performance, data rights, provenance, model updates, and end-of-contract data disposition.
Gap (clause text): The actual clauses are drafted at the agency acquisition-office level. A vendor responding to four agencies encounters four versions of the same nominal clause.

Continuous monitoring: Agencies shall continuously monitor AI performance and risk over the lifecycle of the system.
Gap (operational definition): What is monitored, how often, against what baseline, escalating to whom, and what triggers a re-assessment versus a routine log entry: all undefined.

Data quality: Agencies shall ensure data used for AI is of sufficient quality for the intended use.
Gap (quality threshold): What "sufficient" means is not defined. Programs that have a quality threshold wrote it themselves; programs that don't are operating without one.

The translation problem: Every row here is a place where the policy framework is doing what policy frameworks do, which is defining principle and authority. The implementation work each principle implies is the work the agencies are now improvising, in parallel, with limited coordination across agencies.

Six representative pairings between OMB AI-guidance requirements (M-24-10 / M-25-21 / M-25-22 lineage) and the operational specifications implementation requires. The pattern repeats across most named requirements in the memos.

FCI Advisory analysis of OMB memos and agency implementation patterns, FY24-Q4 through FY26-Q1

The memos require agencies to inventory their AI uses. They do not define the granularity. Does one chatbot deployment with three downstream tools count as one use case or four? The agencies are deciding individually, and the inventories at ai.gov already reflect the variance. The memos require agencies to designate high-impact AI for elevated controls. They do not define the determination procedure. The agencies are inventing the procedure. The memos require AI Impact Assessments. They do not say how those assessments relate to Privacy Impact Assessments under the E-Government Act, FISMA system authorization packages, or the records management determinations under NARA-approved schedules. The agencies are reconciling them program by program.
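The inventory-granularity question can be made concrete with a small sketch. The workflow names, the model identifier, and the two counting rules below are hypothetical illustrations, not definitions drawn from the memos or from any agency's inventory; the point is only that the same deployment produces different inventory sizes depending on the rule an agency picks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str   # hypothetical workflow label
    model: str  # hypothetical identifier of the underlying model


# Hypothetical deployment: one underlying model serving four workflows.
WORKFLOWS = [
    Workflow("benefits-intake-triage", "model-a"),
    Workflow("correspondence-drafting", "model-a"),
    Workflow("call-summary", "model-a"),
    Workflow("records-search", "model-a"),
]


def entries_per_model(workflows):
    """Counting rule A (assumed): one inventory entry per underlying model."""
    return len({w.model for w in workflows})


def entries_per_workflow(workflows):
    """Counting rule B (assumed): one inventory entry per workflow."""
    return len({w.name for w in workflows})


if __name__ == "__main__":
    print("entries under rule A:", entries_per_model(WORKFLOWS))
    print("entries under rule B:", entries_per_workflow(WORKFLOWS))
```

Under rule A this deployment is a single inventory entry; under rule B it is four. Two agencies with identical deployments can therefore publish inventories that differ by a factor of four without either one misreporting.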

Procurement is the area where the gap is most visible right now. M-25-22 requires AI-specific contract clauses covering performance, data rights, training-data provenance, model updates, and disposition of agency data on contract end.³ The implementing clauses — the actual text a contracting officer puts into a solicitation — are still being drafted by acquisition offices, and the drafts are not converging. A federal vendor responding to four agencies' AI procurements in the current cycle will encounter four different versions of the clauses that the same memo nominally produced.

The implementation gap, sorted

Federal AI requirements do not all sit on the same point on the defined-to-undefined spectrum. Sorting them reveals the shape of the implementation problem.

Chart 02 · The implementation gap, mapped

Requirements sort into four implementation territories.

The well-specified bucket is small. The under- and unspecified buckets carry most of the load.

Defined / low variance ◀ ▶ Undefined / high variance

Well-specified (~4 requirements): Explicit memo text or clear extension from existing federal practice. Low variance.
CAIO designation and authority
Annual use case inventory publication
High-level governance structure
Public posting of agency AI strategies

Underspecified (~9 requirements): Memo points at it and stops. Agencies extrapolate from analogous federal practice.
Pre-deployment testing depth
Continuous monitoring cadence
AI Impact Assessment ↔ PIA / FISMA reconciliation
"Minimum practices" operational definition
Data quality threshold for AI use

Actively contested (~3 requirements): Agency legal offices interpret the same requirement differently. De facto precedent forming.
Rights-and-safety / administrative boundary
Procurement-clause language
Waiver and exception scope

Missing entirely (~5 requirements): No defined operational answer in the memos. Agencies improvising or deferring.
Pre-existing AI transition path
AI-generated records under NARA schedules
Foundation-model vendor due diligence depth
Audit-log retention for agentic systems
Inter-agency AI sharing controls

Where the audit risk lives: The "missing entirely" quadrant is where federal auditors will find agencies in 2027 and 2028. The "actively contested" quadrant is where vendor procurement risk concentrates today. The well-specified bucket is one of the smallest of the four.

Federal AI requirements categorized by specification clarity (defined ↔ undefined) and implementation variance (stable ↔ contested). Counts are illustrative of the directional pattern across the M-24-10 / M-25-21 requirement set; the boundary lines between quadrants are not bright.

FCI Advisory framework, derived from federal AI program advisory observation

Some requirements are well-specified. Publication of the use case inventory, the CAIO designation, the high-level governance structure: these have either explicit memo text or clear extensions from existing federal practice. Agencies are implementing them with low variance.

A larger set of requirements is underspecified. Pre-deployment testing depth, continuous monitoring cadence, the operational definition of "minimum practices" for rights-and-safety AI, the relationship between AI Impact Assessments and adjacent assessment regimes — the memos point at these and stop. Agencies that have implemented them have done so by extrapolating from analogous federal practice (FISMA controls, privacy assessment procedure, IT investment review), but the extrapolations are not standardized.

A smaller set of requirements is missing entirely. The memos do not specify what to do with AI that was already deployed before the framework existed. They do not specify how AI-generated content should be classified under NARA records schedules. They do not specify acceptable vendor due diligence depth for foundation-model providers. These gaps are real, and the agencies most exposed to them are the ones with mature AI deployments predating the memos.

A handful of requirements sit in active contention. The boundary between "rights-and-safety" AI and administrative AI — load-bearing in M-24-10 and substantially reframed in M-25-21 — is interpreted differently by agency legal offices. The procurement-clause language sits in this category. The agencies that resolve these in the absence of unified guidance are creating de facto precedent that will be hard to unwind once it propagates.

Where agencies actually are on the requirements

Across the named M-25-21 requirements, federal agency implementation is not uniform. Some are mostly complete. Others are in active rollout. A meaningful share is stuck.

Chart 03 · Agency progress against requirements

Compliance distribution by requirement: done, in progress, stuck.

Well-specified requirements are mostly done. Underspecified ones split. Missing-entirely categories cluster in stuck.

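Share of agencies by status (mostly done / in progress / stuck):

Well-specified · CAIO designation: 92% mostly done
Well-specified · Use case inventory publication: 84% mostly done, 12% in progress
Well-specified · Agency AI strategy posted: 78% mostly done, 16% in progress
Underspecified · AI Impact Assessment procedure: 38% mostly done, 44% in progress, 18% stuck
Underspecified · Continuous monitoring: 22% mostly done, 48% in progress, 30% stuck
Contested · Procurement-clause language: 18% mostly done, 52% in progress, 30% stuck
Missing · Pre-existing AI transition: 12% mostly done, 36% in progress, 52% stuck
Missing · AI-record disposition under NARA: 28% in progress, 64% stuck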

The slope is the story: Reading top to bottom is reading from "the memo says exactly what to do" to "the memo does not say what to do." Completion tracks specification clarity almost perfectly. The agencies are willing; the framework simply does not define the work in the bottom rows.

Implementation status distribution across federal agencies for eight representative M-25-21 requirements. Sorted top to bottom from well-specified to missing-entirely. Share figures reflect FCI's engagement observation; the directional pattern is consistent across the federal estate.

FCI Advisory analysis of federal AI implementation status, FY26-Q1

The pattern is consistent with what the structure of the requirements would predict. The well-specified requirements (CAIO designation, basic inventory publication) are mostly done across the federal estate. The underspecified requirements (AI Impact Assessment procedure, continuous monitoring, data quality thresholds) are partially complete, with agencies in different positions. The missing-entirely categories are where agencies cluster in the stuck column — not because agencies are unwilling, but because the requirements have no defined operational answer.
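The claim that completion tracks specification clarity can be checked against the chart's own figures. The sketch below uses only the "mostly done" shares that carry a label in Chart 03; the final row (AI-record disposition under NARA) is left out because its "mostly done" share is not labelled there. The function names are illustrative, and the figures remain FCI's engagement observations rather than a federal dataset.

```python
# Labelled "mostly done" shares from Chart 03, in chart order (percent).
DONE_SHARE = [
    ("well-specified", "CAIO designation", 92),
    ("well-specified", "Use case inventory publication", 84),
    ("well-specified", "Agency AI strategy posted", 78),
    ("underspecified", "AI Impact Assessment procedure", 38),
    ("underspecified", "Continuous monitoring", 22),
    ("contested", "Procurement-clause language", 18),
    ("missing", "Pre-existing AI transition", 12),
]


def is_strictly_decreasing(values):
    """True when every value is lower than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


def mean_by_category(rows):
    """Average the 'mostly done' share within each specification category."""
    grouped = {}
    for category, _requirement, share in rows:
        grouped.setdefault(category, []).append(share)
    return {c: sum(v) / len(v) for c, v in grouped.items()}


if __name__ == "__main__":
    shares = [share for _, _, share in DONE_SHARE]
    print("strictly decreasing:", is_strictly_decreasing(shares))
    for category, mean in mean_by_category(DONE_SHARE).items():
        print(f"{category}: {mean:.1f}% mostly done on average")
```

Across the seven labelled rows the completion share falls at every step down the chart, and the category averages drop from roughly 85 percent for well-specified requirements to 30 percent for underspecified ones.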

The variance is most pronounced inside the underspecified category. Two agencies of similar size, similar AI maturity, and similar mission scope can show very different completion states on the same requirement, because the requirement does not say what completion looks like. The agencies whose general counsel offices have been most directly involved in implementation tend to score lower on completion and higher on documentation; the agencies whose CIO offices have led tend to score the inverse. Neither is wrong. The framework simply does not say which posture is correct.

"The memos define what agencies must do. They do not define how. The decisions agencies are making in the absence of guidance — about inventory granularity, assessment procedure, vendor controls, and AI-record disposition — are setting precedent the memo never wrote."

The decisions agencies are making in the absence of guidance

The most consequential federal AI decisions of the current budget cycle are being made in the spaces the memos do not cover. These decisions are not yet policy. They will become policy as soon as the next memo cycle catches up, and by then they will be very hard to change.

Chart 04 · The unwritten decisions

Six decisions agencies are making in the memo's absence.

First-mover agencies are setting the implicit standard. The next memo will write down what is already happening.

DECISION 01 · Inventory granularity
Does one underlying model serving four workflows count as one entry or four? Where does a use case begin and end?
Agencies vary widely.

DECISION 02 · Impact-assessment threshold
At what evidence and what consequence level is an AI use escalated for elevated assessment? Who signs off?
No common threshold.

DECISION 03 · Vendor due diligence depth
What training-data provenance, model-update disclosure, and bias-evaluation evidence is sufficient for foundation-model vendors?
Drafted ad hoc.

DECISION 04 · AI-record classification
Which AI-generated artifacts (transcripts, drafts, intermediate outputs) are federal records under NARA schedules, and which are working documents?
Decided program by program.

DECISION 05 · Audit-log retention
For how long are agent decision logs, prompt histories, and tool-call traces retained? Against what schedule?
No common schedule.

DECISION 06 · Pre-existing AI treatment
AI built and deployed before the memos: brought into the new regime, grandfathered, or quietly retired?
Mostly deferred.

Precedent without policy: None of these decisions are documented in the OMB memos. All of them are being made now. The first agencies through the door are setting the implicit standard for the agencies behind them, and the implicit standard is becoming the de facto policy.

Six categories of consequential implementation decision where the OMB AI guidance is silent. Each is being resolved at agency level in the current cycle. The resolutions will harden into precedent before the next memo iteration formalizes them.

FCI Advisory analysis of federal AI program decisions, FY25-Q4 through FY26-Q1

Six categories of decision are visible. Agencies are deciding the granularity of their AI inventories — whether one underlying model serving four workflows counts as one entry or four. Agencies are deciding the threshold at which an AI use is escalated for elevated assessment. Agencies are deciding the depth of vendor due diligence for foundation-model providers — what training-data provenance documentation is sufficient, what model-update disclosure cadence is required, what is acceptable evidence of bias evaluation. Agencies are deciding which AI-generated artifacts are federal records under NARA schedules and which are working documents. Agencies are deciding the audit-log retention period for agentic systems. And agencies are deciding how pre-existing AI deployments — built before the memos existed — are brought into the new regime, or whether they are exempted.

None of these decisions are documented in the OMB memos. All of them are being made now. The first agencies through the door are setting the implicit standard for the agencies behind them, and the implicit standard is becoming the de facto policy. When the next memo cycle attempts to formalize the answers, it will be writing down what is already happening rather than directing what should happen.

What this rules in and out

Four conditions reshape what federal AI program leadership should be doing through the current cycle and into the next budget year:

The policy framework is the floor, not the ceiling. Programs scoped only to the explicit memo requirements are under-scoped. The implementation gaps are real, they are load-bearing for audit and procurement, and they will not be filled by waiting for the next OMB memo. Programs that scope to current practice rather than current policy ship more robustly and survive the next memo cycle with less rework.
The agencies setting the implementation precedent are the ones to watch. The implicit standards being created right now by the first-mover agencies will propagate outward through procurement language, vendor expectations, and the next OMB drafting cycle. The agencies further behind on the curve will inherit those defaults whether they want to or not. Knowing which agencies are setting which precedents is now a procurement-intelligence requirement.
The procurement clauses are the unfinished business. M-25-22 sets the principle; the clauses remain divergent across agencies. Vendors and agencies operating in this gap are accumulating contractual variance that will need reconciliation. Programs writing AI clauses now should write them with the assumption that the language will be revisited within 18 months — and design contract structures that can accommodate revision without renegotiation.
The "missing entirely" category is where audit risk concentrates. The requirements with no defined operational answer — pre-existing AI, AI records disposition, foundation-model vendor due diligence, audit log retention — are where federal auditors will find the agencies in 2027 and 2028. The programs that document their improvised answers now, and document the reasoning behind them, will fare better than programs that leave the decisions implicit.
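Documenting an improvised answer together with its reasoning can be as light as one structured record per decision. The sketch below is one possible shape, not a format prescribed by OMB or NARA: the class name, every field name, and the example values (including the dates and the deciding office) are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class DecisionRecord:
    """One improvised implementation decision and the reasoning behind it."""
    topic: str         # which gap the decision addresses
    decision: str      # what the program decided to do
    rationale: str     # why, in terms an auditor can follow
    decided_by: str    # accountable role or office
    decided_on: date
    revisit_by: date   # when the decision should be re-examined
    memo_basis: list = field(default_factory=list)  # memo text relied on, if any

    def to_json(self):
        """Serialize the record, writing dates in ISO 8601 form."""
        payload = asdict(self)
        payload["decided_on"] = self.decided_on.isoformat()
        payload["revisit_by"] = self.revisit_by.isoformat()
        return json.dumps(payload, indent=2)

    def is_due_for_review(self, today):
        """True once the revisit date has been reached."""
        return today >= self.revisit_by


# Hypothetical example values, for illustration only.
EXAMPLE = DecisionRecord(
    topic="Audit-log retention for agentic systems",
    decision="Retain tool-call traces under the program's existing system-log schedule",
    rationale="No AI-specific schedule exists; the nearest analogous schedule was applied",
    decided_by="Program records officer (hypothetical)",
    decided_on=date(2026, 1, 15),
    revisit_by=date(2027, 1, 15),
    memo_basis=["M-25-21"],
)

if __name__ == "__main__":
    print(EXAMPLE.to_json())
```

The revisit date matters as much as the decision itself: it turns an implicit default into an answer the program has committed to re-examine when the next memo cycle arrives.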

The decision

The OMB AI guidance is not failing. It is doing what policy memos are designed to do — set principle, define authority, name responsible roles. What it is not doing is telling federal program teams what to put in a solicitation, how deep to vet a foundation-model vendor, or which AI-generated artifact gets a NARA disposition schedule. Those decisions are being made anyway. They are being made by the agencies that move first, and they are becoming the standard the rest of the federal estate will inherit. The decision for federal AI program leadership is whether to participate in setting that standard or to wait and inherit it. The agencies waiting are not avoiding the decision. They are deferring it to the agencies acting now.⁶
