The federal AI policy framework now has more text in it than at any point in its history. M-24-10 set the original architecture in March 2024.¹ M-25-21 reframed it around innovation and public trust in April 2025, superseding much of the prior memo.² M-25-22 set the procurement baseline alongside it.³ The NIST AI Risk Management Framework and its Generative AI Profile sit underneath as the technical reference layer.⁴ Every covered agency has a Chief AI Officer. Every agency publishes an annual use case inventory.⁵ By weight of paper, the regime is now substantial.
By weight of implementation guidance, it is not. The memos define what agencies must do; they do not define how. The translation from policy text to operational practice is being done one program at a time, by agency teams improvising against a framework that does not specify procedure. The patterns emerging across the agencies furthest along the curve are not in the memos. They are in the procurement clauses, the impact-assessment templates, the inventory definitions, and the records-disposition decisions that the policy framework points to but does not contain.
What the framework says, and what implementation actually requires
The OMB guidance is at its strongest when it sets principle. It is at its quietest when it would have to define operational detail.
Figure: What the memos say, and what implementation actually requires. Principle is well-defined. Procedure is not. The translation is done one program at a time.
The memos require agencies to inventory their AI uses. They do not define the granularity. Does one chatbot deployment with three downstream tools count as one use case or four? The agencies are deciding individually, and the inventories at ai.gov already reflect the variance. The memos require agencies to designate high-impact AI for elevated controls. They do not define the determination procedure. The agencies are inventing the procedure. The memos require AI Impact Assessments. They do not say how those assessments relate to Privacy Impact Assessments under the E-Government Act, FISMA system authorization packages, or the records management determinations under NARA-approved schedules. The agencies are reconciling them program by program.
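The granularity question can be made concrete with a small sketch. The two counting rules below are illustrative assumptions, not conventions drawn from the memos, and the `Deployment` type, the `count_use_cases` function, and the example names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Deployment:
    """A hypothetical AI deployment: one front-end system plus downstream tools."""
    name: str
    downstream_tools: list[str] = field(default_factory=list)


def count_use_cases(deployments, per_component=False):
    """Count inventory entries under one of two illustrative granularity rules.

    per_component=False: each deployment is a single use case.
    per_component=True: the deployment and each downstream tool count separately.
    """
    if per_component:
        return sum(1 + len(d.downstream_tools) for d in deployments)
    return len(deployments)


# One chatbot with three downstream tools (names are made up for illustration).
chatbot = Deployment(
    "benefits-chatbot",
    ["document-summarizer", "case-router", "translation"],
)

coarse = count_use_cases([chatbot])
fine = count_use_cases([chatbot], per_component=True)
print(coarse, fine)  # 1 4
```

The same deployment yields one entry or four depending on a rule the memos leave open, which is one way two otherwise similar agencies can end up publishing inventories that are not directly comparable.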
Procurement is the area where the gap is most visible right now. M-25-22 requires AI-specific contract clauses covering performance, data rights, training-data provenance, model updates, and disposition of agency data on contract end.³ The implementing clauses (the actual text a contracting officer puts into a solicitation) are still being drafted by acquisition offices, and the drafts are not converging. A federal vendor responding to four agencies' AI procurements in the current cycle will encounter four different versions of the clauses that the same memo nominally produced.
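One way a vendor or acquisition team might track this divergence is a simple coverage check against the five clause areas named above. This is a minimal sketch: the topic labels paraphrase the text rather than quoting official clause titles, and the agency names and draft contents are hypothetical.

```python
# Topic labels paraphrase the five clause areas the text attributes to M-25-22;
# they are illustrative tags, not official clause titles.
REQUIRED_TOPICS = frozenset({
    "performance",
    "data rights",
    "training-data provenance",
    "model updates",
    "data disposition at contract end",
})


def missing_topics(draft_topics):
    """Return the required topics a draft's clauses do not cover, sorted by name."""
    return sorted(REQUIRED_TOPICS - set(draft_topics))


# Hypothetical drafts from two acquisition offices, tagged by topic covered.
drafts = {
    "Agency A": {"performance", "data rights", "model updates"},
    "Agency B": set(REQUIRED_TOPICS),
}

gaps = {agency: missing_topics(topics) for agency, topics in drafts.items()}
for agency, gap in gaps.items():
    print(agency, "missing:", gap or "none")
```

A check like this only shows whether a topic is addressed at all. It says nothing about whether two agencies' clauses on the same topic impose compatible obligations, which is where most of the divergence described above would sit.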
The implementation gap, sorted
Federal AI requirements do not all sit at the same point on the defined-to-undefined spectrum. Sorting them reveals the shape of the implementation problem.
Figure: Requirements sort into four implementation territories. The well-specified bucket is small. The under- and unspecified buckets carry most of the load.
Well-specified:
- CAIO designation and authority
- Annual use case inventory publication
- High-level governance structure
- Public posting of agency AI strategies

Underspecified:
- Pre-deployment testing depth
- Continuous monitoring cadence
- AI Impact Assessment ↔ PIA / FISMA reconciliation
- "Minimum practices" operational definition
- Data quality threshold for AI use

In active contention:
- Rights-and-safety / administrative boundary
- Procurement-clause language
- Waiver and exception scope

Missing entirely:
- Pre-existing AI transition path
- AI-generated records under NARA schedules
- Foundation-model vendor due diligence depth
- Audit-log retention for agentic systems
- Inter-agency AI sharing controls
Some requirements are well-specified. The use case inventory format, the CAIO designation, the high-level governance structure — these have either explicit memo text or clear extensions from existing federal practice. Agencies are implementing them with low variance.
A larger set of requirements is underspecified. Pre-deployment testing depth, continuous monitoring cadence, the operational definition of "minimum practices" for rights-and-safety AI, the relationship between AI Impact Assessments and adjacent assessment regimes — the memos point at these and stop. Agencies that have implemented them have done so by extrapolating from analogous federal practice (FISMA controls, privacy assessment procedure, IT investment review), but the extrapolations are not standardized.
A smaller set of requirements is missing entirely. The memos do not specify what to do with AI that was already deployed before the framework existed. They do not specify how AI-generated content should be classified under NARA records schedules. They do not specify acceptable vendor due diligence depth for foundation-model providers. These gaps are real, and the agencies most exposed to them are the ones with mature AI deployments predating the memos.
A handful of requirements sit in active contention. The boundary between "rights-and-safety" AI and administrative AI — load-bearing in M-24-10 and substantially reframed in M-25-21 — is interpreted differently by agency legal offices. The procurement-clause language sits in this category. The agencies that resolve these in the absence of unified guidance are creating de facto precedent that will be hard to unwind once it propagates.
Where agencies actually are on the requirements
Across the named M-25-21 requirements, federal agency implementation is not uniform. Some requirements are mostly complete. Others are in active rollout. A meaningful share is stuck.
Figure: Compliance distribution by requirement: done, in progress, stuck. Well-specified requirements are mostly done. Underspecified ones split. Missing-entirely categories cluster in stuck.
The pattern is consistent with what the structure of the requirements would predict. The well-specified requirements (CAIO designation, basic inventory publication) are mostly done across the federal estate. The underspecified requirements (AI Impact Assessment procedure, continuous monitoring, data quality thresholds) are partially complete, with agencies in different positions. The missing-entirely categories are where agencies cluster in the stuck column — not because agencies are unwilling, but because the requirements have no defined operational answer.
The variance is most pronounced inside the underspecified category. Two agencies of similar size, similar AI maturity, and similar mission scope can show very different completion states on the same requirement, because the requirement does not say what completion looks like. The agencies whose general counsel offices have been most directly involved in implementation tend to score lower on completion and higher on documentation; the agencies whose CIO offices have led tend to score the inverse. Neither is wrong. The framework simply does not say which posture is correct.
The decisions agencies are making in the absence of guidance
The most consequential federal AI decisions of the current budget cycle are being made in the spaces the memos do not cover. These decisions are not yet policy. They will become policy as soon as the next memo cycle catches up, and by then they will be very hard to change.
Figure: Six decisions agencies are making in the memo's absence. First-mover agencies are setting the implicit standard. The next memo will write down what is already happening.
Six categories of decision are visible. Agencies are deciding the granularity of their AI inventories — whether one underlying model serving four workflows counts as one entry or four. Agencies are deciding the threshold at which an AI use is escalated for elevated assessment. Agencies are deciding the depth of vendor due diligence for foundation-model providers — what training-data provenance documentation is sufficient, what model-update disclosure cadence is required, what is acceptable evidence of bias evaluation. Agencies are deciding which AI-generated artifacts are federal records under NARA schedules and which are working documents. Agencies are deciding the audit-log retention period for agentic systems. And agencies are deciding how pre-existing AI deployments — built before the memos existed — are brought into the new regime, or whether they are exempted.
None of these decisions are documented in the OMB memos. All of them are being made now. The first agencies through the door are setting the implicit standard for the agencies behind them, and the implicit standard is becoming the de facto policy. When the next memo cycle attempts to formalize the answers, it will be writing down what is already happening rather than directing what should happen.
What this rules in and out
Four conditions reshape what federal AI program leadership should be doing through the current cycle and into the next budget year:
- The policy framework is the floor, not the ceiling. Programs scoped only to the explicit memo requirements are under-scoped. The implementation gaps are real, they are load-bearing for audit and procurement, and they will not be filled by waiting for the next OMB memo. Programs that scope to current practice rather than current policy ship more robustly and survive the next memo cycle with less rework.
- The agencies setting the implementation precedent are the ones to watch. The implicit standards being created right now by the first-mover agencies will propagate outward through procurement language, vendor expectations, and the next OMB drafting cycle. The agencies further behind on the curve will inherit those defaults whether they want to or not. Knowing which agencies are setting which precedents is now a procurement-intelligence requirement.
- The procurement clauses are the unfinished business. M-25-22 sets the principle; the clauses remain divergent across agencies. Vendors and agencies operating in this gap are accumulating contractual variance that will need reconciliation. Programs writing AI clauses now should write them with the assumption that the language will be revisited within 18 months — and design contract structures that can accommodate revision without renegotiation.
- The "missing entirely" category is where audit risk concentrates. The requirements with no defined operational answer (pre-existing AI, AI records disposition, foundation-model vendor due diligence, audit-log retention) are where federal auditors will find agencies exposed in 2027 and 2028. The programs that document their improvised answers now, and document the reasoning behind them, will fare better than programs that leave the decisions implicit.
The decision
The OMB AI guidance is not failing. It is doing what policy memos are designed to do: set principle, define authority, name responsible roles. What it is not doing is telling federal program teams what to put in a solicitation, how deep to vet a foundation-model vendor, or which AI-generated artifact gets a NARA disposition schedule. Those decisions are being made anyway. They are being made by the agencies that move first, and they are becoming the standard the rest of the federal estate will inherit. The decision for federal AI program leadership is whether to participate in setting that standard or to wait and inherit it. The agencies waiting are not avoiding the decision. They are deferring it to the agencies acting now.⁶
MF


